Retrieve Records Modified After Last Run DateTime

From Starfish ETL
Jump to: navigation, search

To only run data modified after the last time Starfish was run, we need to do two things:

  1. Set the last time Starfish ran.
  2. Filter the origin based on the last time Starfish ran.

Set Last Run DateTime

Create an After Operation Procedure and run the built in Starfish Function SetLastRunDate. Set Exec When to "Once After Conn". It is necessary to add this, otherwise the last successful run date will not get set.

Sub VBScriptProcedure
	SetLastRunDate
End Sub

Filter Origin Based On Last Run DateTime

First you must retrieve the DateTime your job was last run. This can be inserted directly into the Origin Filter by using the @@VAR:LastRunDate@@ variable. An example of that in a SQL origin would look something like this:

SELECT * FROM Contacts WHERE ModifyDate >= '@@VAR:LastRunDate@@'

In many applications, DateTimes are converted to GMT in database, so you must convert your last run DateTime to GMT before inserting it into your Origin Filter. To do this, Run a Before Operation set to Exec When "Once Before Conn". You could create any number of functions, but this is the one I've used:

Function ScriptedVariable
	Dim gmtLastRun
	gmtLastRun = CDate("@@VAR:LastRunDate@@")
	gmtLastRun = DateAdd("h", 7, gmtLastRun)
	'Sometimes, the Starfish server and the Source Server clocks are not quite in sync, so I subtract 1 minute from my last run datetime to guarantee I don't miss any records.
	gmtLastRun = DateAdd("n", -1, gmtLastRun)
	ScriptedVariable=FormatDate(gmtLastRun, "yyyy-MM-dd HH:mm:ss")
End Function

Then insert the resulting date into your Origin Filter:

cast(integration_modify_date_c as datetime) >= '@@VAR:GMTLastRunDate@@'

or

[{"date_modified":{"$gte":"@@VAR:GMTLastRunDate@@"}}]