Informatica MDM Interview Questions
If you're looking for Informatica MDM interview questions for experienced professionals or freshers, you are at the right place. There are a lot of opportunities at many reputed companies around the world. According to research, Informatica MDM has a market share of about 3.0%, so you still have the opportunity to move ahead in your career in Informatica MDM development. Mindmajix offers Advanced Informatica MDM Interview Questions 2018 to help you crack your interview and acquire your dream career as an Informatica MDM developer.
Q. What does the term MDM mean?
MDM stands for Master Data Management. It is a comprehensive method that enables an enterprise to link all of its critical data to a single file, known as the master file, which provides a common point of reference. When done properly, MDM streamlines the process of data sharing among departments and personnel.
Q. Describe the biggest management and technical challenges in adopting MDM.
The biggest challenge for the technical side of a data governance effort is selling the project and securing funding. Management always looks for ROI: they want MDM tied to quantifiable benefits that business leaders recognize, such as dollar amounts around ROI.
Q. Define Data Warehousing.
It is the main repository of an organisation's historical data and its corporate memory, containing the raw material for management's decision support system. What drives the use of data warehousing is that it allows a data analyst to execute complex queries and analyses, such as data mining, on the information without slowing down the operational systems. The data collected in a data warehouse is organized to support management decision making. A warehouse contains an array of data presenting a coherent image of business conditions at a single point in time. In short, a data warehouse is a repository of information available for analysis and query.
Q. Define Dimensional Modeling.
Dimensional modeling differs from the third-normal-form approach and involves two types of tables: the fact table, which contains the measurements of the business, and the dimension table, which contains the context for those measurements.
Q. Describe various fundamental stages of Data Warehousing.
There are various fundamental stages of Data warehousing. They are:
1. Offline Operational Databases: In this first stage, data warehouses are developed simply by copying the operational system's database to an offline server, so that the processing load of reporting has no impact on the performance of the operational system.
2. Offline Data Warehouse: In this stage of development, data warehouses are updated on a regular basis from the operational systems, and all the data is stored in an integrated, reporting-oriented data structure.
3. Real Time Data Warehouse: In this stage, data warehouses are updated on an event or transaction basis, every time the operational system executes a transaction.
4. Integrated Data Warehouse: In this last stage, data warehouses are used to generate transactions or activity that is passed back into the operational systems for use in the organization's daily activity.
Q. Define Informatica PowerCenter.
Designed by Informatica Corporation, it is data integration software that provides an environment for loading data into a centralized location such as a data warehouse. Data can be extracted from an array of sources, transformed according to the business logic, and then loaded into files as well as relational targets.
Q. Name various components of Informatica PowerCenter.
There are various components of Informatica PowerCenter. They are as follows:
1. PowerCenter Repository
2. PowerCenter Domain
3. PowerCenter Client
4. Administration Console
5. Integration Service
6. Repository Service
7. Data Analyzer
8. Web Services Hub
9. PowerCenter Repository Reports
10. Metadata Manager
Q. Explain Mapping.
Mapping can be best described as a set of source and target definitions linked by transformation objects that define the data transformation rules. It represents the flow of data between sources and targets.
Q. Explain Mapplet.
It is a reusable object containing a set of transformations, allowing that transformation logic to be reused in a wide range of mappings.
Q. Explain Transformation.
It is a repository object that generates, modifies, or passes data. In a mapping, transformations represent the operations that the Integration Service performs on the data. Data flows through transformation ports that are linked within a mapping or mapplet.
Q. Define Data Mining.
It is a process that analyses data from several perspectives and summarizes it into useful information.
Q. Define Fact Table.
A fact table contains the measurements of business processes along with the foreign keys for the dimension tables.
Q. Define Dimension Table.
A dimension table is a collection of categories, hierarchies, and logic that can be used to traverse hierarchy nodes. It contains the textual attributes of the measurements stored in the fact tables.
Q. Name the foreign key columns in dimension and fact tables.
A dimension table's foreign keys are primary keys of the entity tables.
A fact table's foreign keys are primary keys of the dimension tables.
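To make this concrete, here is a minimal star-schema sketch in SQL; the table and column names are hypothetical:
CREATE TABLE DIM_CUSTOMER (
  CUSTOMER_KEY NUMBER PRIMARY KEY,   -- primary key referenced by the fact table
  CUSTOMER_NAME VARCHAR2(100)        -- textual attribute of the dimension
);
CREATE TABLE FACT_SALES (
  SALE_AMOUNT NUMBER,                -- measurement of the business process
  SALE_QTY NUMber,                   -- measurement of the business process
  CUSTOMER_KEY NUMBER REFERENCES DIM_CUSTOMER (CUSTOMER_KEY)  -- foreign key to the dimension table
);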
Q. Describe different methods to load dimension tables.
There are two different methods for loading data into dimension tables. They are as follows:
>> Direct or fast: In this method, all the keys and constraints are disabled before the data is loaded. Once the complete data is loaded, it is validated against all the keys and constraints. If any data is found to be invalid, it is not included in the index and all further processing on that data is skipped.
>> Conventional or slow: In this method, all the keys and constraints are validated as the data is loaded, which helps in maintaining data integrity.
Q. Name various objects that can’t be used in a mapplet.
There are a number of objects that you cannot use in a mapplet. They are:
1. Joiner transformations
2. COBOL source definitions
3. Target definitions
4. IBM MQ source definitions
5. XML source definitions
6. Normalizer transformations
7. Non-reusable Sequence Generator transformations
8. PowerMart 3.5-style LOOKUP functions
9. Pre- and post-session stored procedures
Q. Define different ways used in Informatica to migrate from one environment to another.
1. The repository can be exported and imported into the new environment
2. Informatica deployment groups can be used
3. Folders/objects can be copied
4. Each mapping can be exported to XML and then imported into the new environment
Q. What are the ways to delete duplicate records in Informatica?
There are several ways to delete duplicate records in Informatica. They are as follows:
1. Using SELECT DISTINCT in the Source Qualifier
2. Using an Aggregator transformation and grouping by all fields
3. Overriding the SQL query in the Source Qualifier (see the sketch below)
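As a sketch of the third option, assuming a hypothetical Oracle source table EMP with a LAST_UPDATED_DATE column, the SQL override in the Source Qualifier could keep only the latest row per key:
SELECT EMPNO, ENAME, DEPTNO
FROM (
  SELECT EMPNO, ENAME, DEPTNO,
         ROW_NUMBER() OVER (PARTITION BY EMPNO ORDER BY LAST_UPDATED_DATE DESC) RN
  FROM EMP
)
WHERE RN = 1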
Q. Differentiate between a mapping variable and a mapping parameter.
>> A mapping variable is dynamic, i.e. its value can change during the session. Before the session starts, PowerCenter reads the variable's initial value; during the run it uses variable functions to change the value, and before the session ends it saves the current value. The variable thus holds the last value, and the next time the session runs, the variable starts from the value saved in the previous run.
>> A mapping parameter is a static value, defined by you before the session starts, and the value remains the same until the end of the session. When the session runs, PowerCenter evaluates the parameter's value and retains that value throughout the entire session. The next time the session runs, it reads the value from the parameter file again.
Q. Describe various repositories that can be generated using Informatica Repository Manager.
There are various repositories that can be created with the help of Informatica Repository Manager. They are as follows:
1. Standalone Repository: A repository that functions individually and is not related to any other repositories.
2. Local Repository: A repository that functions within a domain. It can connect to a global repository using global shortcuts, and it can use objects in the global repository's shared folders.
3. Global Repository: A repository that acts as the centralised repository of a domain. It contains objects shared across the repositories in the domain.
Q. What is the way to find all the invalid mappings in a folder?
All the invalid mappings in a folder can be found using the following query against the repository views:
SELECT MAPPING_NAME FROM REP_ALL_MAPPINGS WHERE
SUBJECT_AREA='YOUR_FOLDER_NAME' AND PARENT_MAPPING_IS_VALID <> 1
Q. Name various data movement modes in Informatica.
A data movement mode determines how the PowerCenter server handles character data. The data movement mode is selected in the Informatica server configuration settings. There are two different data movement modes available in Informatica. They are:
1. Unicode mode
2. ASCII mode
Q. Explain OLAP.
OLAP stands for Online Analytical Processing. It is an approach that gathers, manages, processes, and presents multidimensional data for management and analysis purposes.
Q. Explain OLTP.
OLTP stands for Online Transaction Processing, a class of systems that modify data the instant it is received and support a large number of concurrent users.
Q. Describe the parallel degree of data loading properties in MDM.
This property specifies the degree of parallelism set on the base object table and its related tables. It does not apply to all batch processes, but where it does apply it can have a positive effect on performance. Its use, however, is limited by the number of CPUs on the database server machine and the amount of available memory. The default value is 1.
Q. Explain various types of LOCK used in Informatica MDM 10.1.
Two types of LOCK are used in Informatica MDM 10.1. They are:
1. Exclusive Lock: Allows only one user to make changes to the underlying operational reference store.
2. Write Lock: Allows multiple users to make changes to the underlying metadata at the same time.
Q. What is the automatic lock expiration mechanism in Informatica MDM?
The Hub Console refreshes the current connection every 60 seconds, which renews the lock. A user can release a lock manually. If a user switches to another database while holding a lock, the lock is released automatically. If the Hub Console is terminated, the lock expires after one minute.
Q. Name the tool which does not require Lock in Informatica MDM.
The Merge Manager, Data Manager, and Hierarchy Manager do not require write locks. The Audit Manager also does not need write locks.
Q. Name various tools that require LOCK in Informatica MDM.
There are several tools that require LOCK to make configuration changes to the database of MDM Hub Master. They are:
1. Message Queues
2. Tool Access
3. Security Providers
4. Repository Manager
Q. Name the tables that are linked with staging data in Informatica MDM.
There are various tables that are linked with staging data in Informatica MDM. They are:
1. Landing Table
2. Raw Table
3. Rejects Table
4. Staging Table
This article tries to minimize hard-coding in ETL through the judicious use of Informatica parameters and variables, thereby increasing flexibility, reusability, and readability, and avoiding rework.
Step by step, we will see which attributes can be parameterized in Informatica, from the mapping level up to the session, worklet, workflow, folder, and Integration Service levels.
Parameter files provide us with the flexibility to change parameter and variable values every time we run a session or workflow.
So, let us begin the journey!
Parameter File in Informatica
- A parameter file contains a list of parameters and variables with their assigned values, grouped under section headings that define their scope, such as [Global], [Folder_Name.WF:Workflow_Name.WT:Worklet_Name.ST:Session_Name], or [Session_Name].
- A value of <null> sets a parameter to null, while an empty value sets it to an empty string, e.g. $PMBadFileDir=<null> or $PMCacheDir=.
- For example:
[Global]
$DBConnection_TGT=Orcl_Global
[Folder_Name.WF:Workflow_Name.ST:Session_Name]
$DBConnection_TGT=Orcl_SG
In the named session, the value of the session parameter $DBConnection_TGT is Orcl_SG; for all other sessions in the workflow, the connection object used will be Orcl_Global.
Scope of Informatica Parameter File
Next, we take a quick look at how we can restrict the scope of parameters by changing the parameter file heading section.
- [Global] -> All Integration Services, workflows, worklets, and sessions.
- [Service:IntegrationService_Name] -> The named Integration Service and the workflows, worklets, and sessions that run under it.
- [Service:IntegrationService_Name.ND:Node_Name] -> The named Integration Service process running on the named node.
- [Folder_Name.WF:Workflow_Name] -> The named workflow and all sessions within the workflow.
- [Folder_Name.WF:Workflow_Name.WT:Worklet_Name] -> The named worklet and all sessions within the worklet.
- [Folder_Name.WF:Workflow_Name.WT:Worklet_Name.WT:Nested_Worklet_Name] -> The named nested worklet and all sessions within the nested worklet.
- [Folder_Name.WF:Workflow_Name.WT:Worklet_Name.ST:Session_Name] -> The named session.
- [Folder_Name.WF:Workflow_Name.ST:Session_Name] -> The named session.
- [Folder_Name.ST:Session_Name] -> The named session.
- [Session_Name] -> The named session.
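When the same parameter appears under more than one heading, the value from the most specific applicable section wins for the sessions it covers. A small sketch with hypothetical folder, workflow, and session names:
[Global]
$$LOAD_DT=2010-12-01
[WorkFolder.WF:wf_Load_DWH.ST:s_m_Load_EMP]
$$LOAD_DT=2010-12-08
Here s_m_Load_EMP resolves $$LOAD_DT to 2010-12-08, while every other session picks up the [Global] value 2010-12-01.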
Types of Parameters and Variables
There are many types of Parameters and Variables we can define. Please find below the comprehensive list:
- Service Variables: To override the Integration Service variables such as email addresses, log file counts, and error thresholds. Examples of service variables are $PMSuccessEmailUser, $PMFailureEmailUser, $PMWorkflowLogCount, $PMSessionLogCount, and $PMSessionErrorThreshold.
- Service Process Variables: To override the directories for Integration Service files for each Integration Service process. Examples of service process variables are $PMRootDir, $PMSessionLogDir, and $PMBadFileDir.
- Workflow Variables: To use variable values at the workflow level, e.g. a user-defined workflow variable like $$Rec_Cnt.
- Worklet Variables: To use variable values at the worklet level, e.g. a user-defined worklet variable like $$Rec_Cnt. We can use predefined worklet variables like $TaskName.PrevTaskStatus in a parent workflow, but we cannot use workflow variables from the parent workflow in a worklet.
- Session Parameters: Define values that may change from session to session, such as database connections, DB owners, or file names. Built-in examples are $PMSessionLogFile and $DynamicPartitionCount; $Param_Tgt_Tablename is a user-defined session parameter. A list of other built-in session parameters:
$PMFolderName, $PMIntegrationServiceName, $PMMappingName, $PMRepositoryServiceName, $PMRepositoryUserName, $PMSessionName, $PMSessionRunMode [Normal/Recovery], $PM_SQ_EMP@numAffectedRows, $PM_SQ_EMP@numAppliedRows, $PM_SQ_EMP@numRejectedRows, $PM_SQ_EMP@TableName, $PM_TGT_EMP@numAffectedRows, $PM_TGT_EMP@numAppliedRows, $PM_TGT_EMP@numRejectedRows, $PM_TGT_EMP@TableName, $PMWorkflowName, $PMWorkflowRunId, $PMWorkflowRunInstanceName.
Note: Here SQ_EMP is the Source Qualifier name and TGT_EMP is the Target Definition name.
- Mapping Parameters: Define values that remain constant throughout a session run. Examples are $$LOAD_SRC and $$LOAD_DT; a predefined example is $$PushdownConfig.
- Mapping Variables: Define values that change during a session run. The Integration Service saves the value of a mapping variable to the repository at the end of each successful session run and uses that value the next time you run the session. Example: $$MAX_LOAD_DT.
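Put together, a parameter file mixing these types might look like the following sketch (the Integration Service, folder, workflow, and session names are hypothetical):
[Service:Dev_IS]
$PMSuccessEmailUser=admin@example.com
[WorkFolder.WF:wf_Load_DWH]
$$Rec_Cnt=0
[WorkFolder.WF:wf_Load_DWH.ST:s_m_Load_EMP]
$DBConnection_SRC=Info_Src_Conn
$InputFile_EMP=emp.dat
$$LOAD_SRC=ORCL_IND
$$MAX_LOAD_DT=2010-11-02 00:00:00
Here $PMSuccessEmailUser is a service variable, $$Rec_Cnt a workflow variable, $DBConnection_SRC and $InputFile_EMP session parameters, and $$LOAD_SRC and $$MAX_LOAD_DT a mapping parameter and a mapping variable respectively.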
Difference between Mapping Parameters and Variables
A mapping parameter represents a constant value that we define before running a session; it retains the same value throughout the entire session. If we want to change the value of a mapping parameter between session runs, we need to update the parameter file.
A mapping variable represents a value that can change during the session. The Integration Service saves the value of a mapping variable to the repository at the end of each successful session run and uses that value the next time we run the session. Variable functions like SetMaxVariable, SetMinVariable, SetVariable, and SetCountVariable are used in the mapping to change the value of the variable. At the beginning of a session, the Integration Service evaluates references to the variable to determine the start value. At the end of a successful session, it saves the final value of the variable to the repository. The next time we run the session, it evaluates references to the variable using the saved value. To override the saved value, define the start value of the variable in the parameter file.
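As a sketch of the classic incremental-load pattern built on a mapping variable (the table, column, and port names are hypothetical): declare $$MAX_LOAD_DT as a date/time mapping variable with the Max aggregation type, filter the source on it, and push it forward with SETMAXVARIABLE in an Expression transformation.
Source Qualifier filter:
LAST_UPDATED_DATE > TO_DATE('$$MAX_LOAD_DT', 'MM/DD/YYYY HH24:MI:SS')
Expression transformation output port:
v_MAX_DT = SETMAXVARIABLE($$MAX_LOAD_DT, LAST_UPDATED_DATE)
At the end of a successful run, the Integration Service saves the highest LAST_UPDATED_DATE processed to the repository, so the next run reads only newer rows.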
Parameterize Connection Object
First of all, the most common thing we usually parameterize is the relational connection objects, since the connection information invariably changes as we move from the development to the production environment. Hence we prefer parameterization over resetting the connection objects for every source, target, and lookup each time we migrate our code to a new environment. E.g.:
- $DBConnection_SRC
- $DBConnection_TGT
If we have one source and one target connection object in the mapping, it is better to relate all the sources, targets, lookups, and stored procedures to the $Source and $Target connections. Then we only parameterize the $Source and $Target connection information as:
- $Source connection value: the parameterized connection $DBConnection_SRC
- $Target connection value: the parameterized connection $DBConnection_TGT
Let's have a look at how the parameter file looks. Parameterization can be done at the folder level, workflow level, worklet level, and down to the session level.
[WorkFolder.WF:wf_Parameterize_Src.ST:s_m_Parameterize_Src]
$DBConnection_SRC=Info_Src_Conn
$DBConnection_TGT=Info_Tgt_Conn
Here Info_Src_Conn, Info_Tgt_Conn are Informatica Relational Connection Objects.
Note: The $DBConnection prefix lets Informatica know that we are parameterizing relational connection objects.
For Application Connection objects use the likes of $AppConnection_Siebel, for Loader Connection objects $LoaderConnection_Orcl, and for Queue Connection objects $QueueConnection_portal.
In the same way, we can use mapping-level parameters and variables as and when required, for example $$LOAD_SRC, $$LOAD_CTRY, $$COMMISSION, $$DEFAULT_DATE, $$CDC_DT.
Parameterize Source Target Table and Owner Name
A situation may arise where we need to use a single mapping to read from various DB schemas and tables and load the data into different DB schemas and tables, provided the table structures are the same.
A practical scenario: we need to load the employee information of IND, SGP, and AUS into a global data warehouse, where the source tables may be orcl_ind.emp, orcl_sgp.employee, and orcl_aus.emp_aus.
So we can fully parameterize the Source and Target table name and owner name.
- $Param_Src_Tablename
- $Param_Src_Ownername
- $Param_Tgt_Tablename
- $Param_Tgt_Ownername
The parameter file:
[WorkFolder.WF:wf_Parameterize_Src.ST:s_m_Parameterize_Src]
$DBConnection_SRC=Info_Src_Conn
$DBConnection_TGT=Info_Tgt_Conn
$Param_Src_Ownername=ODS
$Param_Src_Tablename=EMPLOYEE_IND
$Param_Tgt_Ownername=DWH
$Param_Tgt_Tablename=EMPLOYEE_GLOBAL
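These parameters are then referenced wherever the owner and table names would otherwise be hard-coded, e.g. in the session attributes for the source and target table name and owner, or in a Source Qualifier SQL override like this sketch (the column names are hypothetical):
SELECT EMPNO, ENAME, DEPTNO
FROM $Param_Src_Ownername.$Param_Src_Tablename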
Parameterize Source Qualifier Attributes
Next, let's look at the other attributes we can parameterize in the Source Qualifier:
- SQL Query: $Param_SQL
- Source Filter: $Param_Filter
- Pre SQL: $Param_Src_Presql
- Post SQL: $Param_Src_Postsql
If we have a user-defined SQL statement containing joins as well as filter conditions, it is better to add a $$WHERE clause at the end of the SQL query. Here $$WHERE is just a mapping-level parameter defined in the parameter file.
In general, $$WHERE will be blank. If we want to run the mapping for today's date or some other filter criteria, all we need to do is change the value of $$WHERE in the parameter file:
$$WHERE=AND LAST_UPDATED_DATE > SYSDATE - 1 [WHERE clause already present in the override query]
or
$$WHERE=WHERE LAST_UPDATED_DATE > SYSDATE - 1 [no WHERE clause in the override query]
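For instance, with a hypothetical override query that already contains a WHERE clause, $$WHERE is simply appended at the end:
SELECT EMP.EMPNO, EMP.ENAME, EMP.SAL, DEPT.DNAME
FROM EMP, DEPT
WHERE EMP.DEPTNO = DEPT.DEPTNO
$$WHERE
With $$WHERE=AND LAST_UPDATED_DATE > SYSDATE - 1 in the parameter file, only the last day's changes are picked up; with $$WHERE left blank, the full result set is returned.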
Parameterize Target Definition Attributes
Next, these are the attributes we can parameterize in the Target Definition:
- Update Override: $Param_UpdOverride
- Pre SQL: $Param_Tgt_Presql
- Post SQL: $Param_Tgt_Postsql
Parameterize Flatfile Attributes
Now let's see what we can do when it comes to source, target, or lookup flat files:
- Source file directory: $PMSourceFileDir\ [default location: SrcFiles]
- Source filename: $InputFile_EMP
- Source code page: $Param_Src_CodePage
- Target file directory: $PMTargetFileDir\ [default location: TgtFiles]
- Target filename: $OutputFile_EMP
- Reject file directory: $PMBadFileDir\ [default location: BadFiles]
- Reject file: $BadFile_EMP
- Target code page: $Param_Tgt_CodePage
- Header command: $Param_headerCmd
- Footer command: $Param_footerCmd
- Lookup flat file: $LookupFile_DEPT
- Lookup cache file prefix: $Param_CacheName
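In the parameter file, these might be assigned as in the following sketch (the workflow, session, and file names are hypothetical):
[WorkFolder.WF:wf_Load_EMP.ST:s_m_Load_EMP]
$InputFile_EMP=emp_ind_20101102.dat
$OutputFile_EMP=emp_global.dat
$BadFile_EMP=emp_global.bad
$LookupFile_DEPT=dept.dat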
Parameterize FTP Connection Object Attributes
Now, for FTP connection objects, the following are the attributes we can parameterize:
- FTP Connection Name: $FTPConnection_SGUX
- Remote Filename: $Param_FTPConnection_SGUX_Remote_Filename [use the directory path along with the filename if the directory is different from the default directory]
- Is Staged: $Param_FTPConnection_SGUX_Is_Staged
- Is Transfer Mode ASCII: $Param_FTPConnection_SGUX_Is_Transfer_Mode_ASCII
Parameterizing the username and password information of connection objects is also possible, e.g. with $Param_OrclUname.
When it comes to passwords, it is recommended to encrypt the password in the parameter file using the pmpasswd command-line program with the CRYPT_DATA encryption type.
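A sketch of the encryption step (the password and the parameter name are hypothetical; check the option syntax for your PowerCenter version):
pmpasswd MyS3cretPwd -e CRYPT_DATA
The utility prints an encrypted string, which is then assigned to the password session parameter in the parameter file, e.g. $Param_OrclPwd=<encrypted string>.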
Using Parameter File
We can specify the parameter file name and directory in the workflow or session properties or in the pmcmd command line.
We can use parameter files with the pmcmd startworkflow or starttask commands. These commands allow us to specify the parameter file to use when we start a workflow or session.
The pmcmd -paramfile option defines which parameter file to use when a session or workflow runs. The -localparamfile option defines a parameter file on a local machine that we can reference when we do not have access to parameter files on the Integration Service machine.
The following command starts a workflow using the parameter file param.txt:
pmcmd startworkflow -u USERNAME -p PASSWORD -sv INTEGRATIONSERVICENAME -d DOMAINNAME -f FOLDER -paramfile 'infa_shared/BWParam/param.txt' WORKFLOWNAME
The following command starts a task using the parameter file param.txt:
pmcmd starttask -u USERNAME -p PASSWORD -sv INTEGRATIONSERVICENAME -d DOMAINNAME -f FOLDER -w WORKFLOWNAME -paramfile 'infa_shared/BWParam/param.txt' SESSION_NAME
Workflow and Session Level Parameter File
When we define both a workflow parameter file and a session parameter file for a session within the workflow, the Integration Service uses the workflow parameter file and ignores the session parameter file. What if we want to read some parameters from a workflow-level parameter file and some from a session-level parameter file?
The solution is simple:
- Define the workflow parameter file. Say infa_shared/BWParam/param_global.txt.
- Define a workflow variable and assign it, in param_global.txt, the session-level parameter file name. Say $$var_param_file=infa_shared/BWParam/param_runtime.txt.
- In the session properties, set the parameter file name to this workflow variable.
- Add $PMMergeSessParamFile=TRUE to the workflow-level parameter file.
Content of infa_shared/BWParam/param_global.txt:
[WorkFolder.WF:wf_runtime_param]
$DBConnection_SRC=Info_Src_Conn
$DBConnection_TGT=Info_Tgt_Conn
$PMMergeSessParamFile=TRUE
$$var_param_file=infa_shared/BWParam/param_runtime.txt
Content of infa_shared/BWParam/param_runtime.txt:
[WorkFolder.WF:wf_runtime_param.ST:s_m_emp_cdc]
$$start_date=2010-11-02
$$end_date=2010-12-08
The $PMMergeSessParamFile property causes the Integration Service to read both the session and workflow parameter files.