There is some interesting behaviour that I noticed recently in Informatica's repository with respect to session's run time information. If you are not much comfortable with Informatica repository that I now I am going to explain about, here is some quick notes.
Informatica entire application lies on a data repository that consists of several tables (mostly start with OPB*). Perhaps in order to make an ease access to repository information, Informatica also provided several views based on the OPB tables. For many of the common questions you may want the answer REP views should help, bu there is always an exception and OPB tables are there. When you write queries on OPB tables take due care that you do not ruin database with bad joins - for this Informatica manual gives the columns that generally good candidates for joins when using REP views, but generally the same information applies to joining tables as well since a view is just a named query. Some useful views are... REP_WORKFLOWS - Workflow master list REP_WFLOW_RUN - Worklow run time details REP_ALL_TASKS - Master task list REP_TASK_INST_RUN - Task level run time details REP_SESS_LOG - Session's Success/Failed records etc...
With those details, let me explain the what the prob
em is in 7.1. Workflow_Run_ID uniquely identifies each run of a workflow - means when a workflows starts (starting part of the workflow or full workflow or a schedule run) it would have a new workflow_run_id.
For suspended workflow (enabled "Suspend on Failure"), common way we recover it is by resuming it after the fix is provided (in 8.6 option name is "Recover"), and resuming it obviously not a new run so doesn't carry a new worklflow_run_id after resumed. It also means that there would not be a new entry into REP_WFLOW_RUN (or OPB_WFLOW_RUN table) when a workflow is resumed.
Moving a little back, a workflow goes to suspended state when it's meant to fail by the configuration of underlying tasks - means when underlying task fails that makes it's parent (or workflow in a simple setup) to fail and so workflow suspends. So, when such suspended workflow is resumed that actually restarts the failed tasks in the run. I noticed, in 7.1, even though resuming does not create a new workflow_run_id, tasks that restarts when its parent workflow is resumed would create new entries in REP_TASK_INST_RUN. That might be for good reason, but I see that of not much help rather misleading as there is no column that distinguishes task restarts (start_time of task can give that info, but that's not a good candidate).
Moreover, Informatica 7.1 behaves badly in maintaining information in those records - if a task is failed, say 3 times, when resumed multiple times, and finally succeeded in 4th attempt, then all the 4 records of that task run would get updated with last run information (except start_time that I identified) - end_time, error_code etc...and I would say this is completely misleading. Similar behaviour can be noticed in other tables/views that stores the task level run time information like REP_SESS_LOG (OPB_SESS_LOG).
If you run this below query on 7.1 repository you would notice some records being fetched (provided there was atleast one attempt of resuming a suspended workflow)
select workflow_run_id, instance_id, count(*) from infav71.rep_task_inst_run group by workflow_run_id, instance_id having count(*) > 1Even more dangerous part is with the views that joins OPB_TASK_INST_RUN and OPB_SESS_LOG tables. As I said, there is no good candidate in those tables to identify the multiple task (or instance to be precise) runs when suspended workflow is resumed, joining these 2 tables would make cartesian product - means 4 entries in both these 2 tables would return 8 records - that's a flaw, and you can notice that in 7.1.
select * from infa71.OPB_TASK_INST_RUN where INSTANCE_ID = 18738 and WORKFLOW_RUN_ID = 7113806; SUBJECT_ID WORKFLOW_ID WORKFLOW_RUN_ID WORKLET_RUN_ID CHILD_RUN_ID INSTANCE_ID INSTANCE_NAME TASK_NAME TASK_ID TASK_TYPE START_TIME END_TIME RUN_ERR_CODE RUN_ERR_MSG RUN_STATUS_CODE RUN_MODE VERSION_NUMBER SERVER_ID SERVER_NAME 48 13707 7113806 7113823 0 18738 s_m_test s_m_test 20461 68 8/24/2009 2:00 8/24/2009 2:00 0 1 1 1 1 TEST_SRVR 48 13707 7113806 7113823 0 18738 s_m_test s_m_test 20461 68 8/24/2009 1:47 8/24/2009 2:00 0 1 1 1 1 TEST_SRVR 48 13707 7113806 7113823 0 18738 s_m_test s_m_test 20461 68 8/24/2009 1:14 8/24/2009 2:00 0 1 1 1 1 TEST_SRVR 48 13707 7113806 7113823 0 18738 s_m_test s_m_test 20461 68 8/24/2009 0:52 8/24/2009 2:00 0 1 1 1 1 TEST_SRVR 48 13707 7113806 7113823 0 18738 s_m_test s_m_test 20461 68 8/24/2009 0:17 8/24/2009 2:00 0 1 1 1 1 TEST_SRVR 48 13707 7113806 7113823 0 18738 s_m_test s_m_test 20461 68 8/24/2009 0:02 8/24/2009 2:00 0 1 1 1 1 TEST_SRVR
If you look at the above result only start_time is differing, and values of older runs are updated to last run values (error_code, end_time etc... though they have different values from run to run)
Looks like it's fixed in 8.6 - I did not test in version between 7.1 and 8.6 though. In 8.6 however number of resuming of a suspended workflow may be, a failed task instance would have only one run per workflow_run_id. This is much better, rather misleading with more information. if you run the first group by query in your 8.6 repository, you should not be getting any records at all.
select workflow_run_id, instance_id, count(*) from infav88.rep_task_inst_run group by workflow_run_id, instance_id having count(*) > 1 0 row(s) retrieved
Hope that helps.
I appreciate your comments... Thanks!
Karteek
Labels: bi, bug, informatica, repository
2 Comments:
-
- Anonymous said...
7:56 AMEntzГјckend levitra preis cialis 20mg tadalafil [url=http//t7-isis.org]viagra ohne rezept[/url]- Anonymous said...
10:43 PMla informaciГіn muy buena [url=http://csalamanca.com/comprar-viagra-generico/ ]comprar viagra espana [/url] lo habГ©is dicho correctamente:) http://csalamanca.com/category/viagra/ la viagra