CSV

From Starfish ETL
Jump to navigation Jump to search

Origin

We've seen some display issues with the CSV connector. The Connection String sample is not quite correct. It should read:

Data Source="C:\Starfish\Folder";Extended Properties="Text;HDR=Yes;"

Your SqL Selection Statement then selects from the filename:

SELECT * FROM districts.csv

In addition, there should be a "Select File..." button just above the Connection String box and all the way to the right. Sometimes, this doesn't display.

In version 5 and higher, you need quotes around the filename in the selection statement:

SELECT * FROM "districts.csv"

The file needs to be in a location that the Starfish app pool can read from. Either grant access to the directory for "IIS_IUSRS" or put the CSV file under C:\inetpub\wwwroot\StarfishEngine

Issues

Type mismatch issue where column contains some alpha numeric entries but Starfish believes the data is numeric.

If a column of data contains numeric data for the first few rows and then later contains alphanumeric data Starfish will expect numeric data from every row in the data and will then fail to parse the alphanumeric data. Because by default the Microsoft Jet driver that is used for this functionality only scans the first 16 rows, it tries to make a ‘guess’ based on that data for each column type. This is only a problem is the data changes from numeric to alphanumeric partway through the data as thinks the data is all numbers. In registry, navigate to: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Text Find the MaxScanRows key, and change the value to ‘0’. Restart Starfish. It may also be necessary to perform an IISRESET.