File System
File System Origin
The File System Connector reads all files from the folder specified in the Directory field. The Username and Password fields are not used. The Filter field can be used to filter for certain file names or extensions. To search for all files with a "txt" extension, use the filter "*.txt". To search for all files with "Work" anywhere in the filename, use the filter "*Work*".
The Connector returns standard file metadata in the following fields:
- CreationTime
- FullDirectoryName
- DirectoryRelative
- DirectoryName
- Extension
- FullName
- IsReadOnly
- LastAccessTime
- LastWriteTime
- Length
- Name
The Connector returns all remaining information available for the file, in JSON format, in the ExtendedProperties field. Rather than adding a column for each potential property, we went with a more dynamic method: a single column called "ExtendedProperties" with the data formatted as JSON.
It's very easy to parse this and pull out the data you need in Starfish.
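For example, the ExtendedProperties value for an Office document might look something like this (the available properties vary by file type, and this snippet is only an illustration):

{
    "Authors": "Jane Doe",
    "Title": "Quarterly Report"
}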
First, create a Before Each Row VBScript Operation which loads the chunk of JSON into the parser:
Sub VBScriptProcedure
    ParseJSON("@@ORG:ExtendedProperties@@")
End Sub
Then, on your field, pull out the value you want by name:
Function ScriptedField
    ScriptedField = GetJSON("Authors")
End Function

Function ScriptedField
    ScriptedField = GetJSON("Title")
End Function
and so on for any other property you need.
If a value doesn't exist for a particular file, GetJSON will just return an empty string ("").
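If you'd rather substitute a default than pass the empty string through, test the result first. A minimal sketch (the "Unknown" default is only an illustration):

Function ScriptedField
    Dim authors
    authors = GetJSON("Authors")
    ' GetJSON returns "" when the property doesn't exist for this file
    If authors = "" Then authors = "Unknown"
    ScriptedField = authors
End Function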
Reading and Manipulating a File
The connector itself cannot read the contents of a file, but VBScript or C# code can be used to read and manipulate files.
Reading a file:
Function ScriptedField
    Dim text
    text = ExtractText("C:\test.txt")
    ScriptedField = text
End Function
Moving a file:
Function ScriptedField
    Call MoveFile("C:\TestDir1\test.txt", "C:\TestDir2\test.txt")
End Function
You can also rename a file with this method by giving the destination path a different file name.
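For example, a rename is just a move within the same folder (the new file name here is only an illustration):

Function ScriptedField
    ' Renaming = moving the file to the same directory under a new name
    Call MoveFile("C:\TestDir1\test.txt", "C:\TestDir1\test_renamed.txt")
End Function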
Processing a Huge Number of Files
Processing a huge number of files can cause performance issues. We started with a folder containing 40,000 tiny .txt files; the first 10,000 or so ran without a problem, and then the job slowed to a crawl. The solution was to move each file to a new folder after processing it. That way, the files could be processed in increments of 5,000.
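A sketch of that pattern, assuming a per-row VBScript Operation that runs after each file's row has been processed, and assuming the FullName and Name fields can be referenced with the same @@ORG:...@@ token syntax used above (the C:\Processed folder is only an illustration):

Sub VBScriptProcedure
    ' Move the file that was just processed out of the source folder,
    ' so subsequent runs only see the files that still need processing.
    Call MoveFile("@@ORG:FullName@@", "C:\Processed\@@ORG:Name@@")
End Sub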