DATA ARCHITECT VS DATA WAREHOUSE ARCHITECT VS BI SOLUTION ARCHITECT

What is the difference between a data warehouse architect and a data architect? How about BI solution architect?


Both data architects and data warehouse architects do data modelling, using tools such as ERwin or Embarcadero ER/Studio. But a data warehouse architect specialises in designing the data model for a data warehouse, whereas a data architect specialises in designing the data model for databases used by transactional systems.

A data warehouse architect does a lot more than just data modelling. They also do the ETL and the infrastructure. These are areas that a data architect does not normally cover.

For a data architect to be able to call themselves a data warehouse architect, it is not enough to learn how to create a data model for a data warehouse (as in dimensional modelling). They also need to understand the ETL architecture, and they need to understand the servers. For example, they need to be able to write the specification for the production data warehouse servers, i.e. CPU, memory and disks, as well as other server concerns such as clustering, mirroring and DR. And they need to understand physical database features too, such as table partitioning, filegroups and materialised views.

In my book I explain that there are two sides to data warehouse architecture: the first is the logical architecture and the second is the physical architecture. A data warehouse architect needs to understand both.

A data warehouse architect, in my opinion, requires four separate skills: data architecture, ETL, database platform and physical infrastructure. By “database platform” I mean SQL Server knowledge, Oracle knowledge, Teradata knowledge, Netezza knowledge, etc. For example, “SQL Server 2008 Data Warehousing Features” is a must-know for a DW architect on SQL Server, whereas UPI and USI are must-knows for a DW architect on Teradata. If we design a DW on Oracle 11g R2, we need to know Oracle-specific DW features, such as initialisation parameter settings, partition-wise joins, compression, parallel execution, etc.

A BI architect works more on the application side, as in SharePoint architecture, Hyperion architecture, Reporting Services architecture, and Analysis Services architecture. For example: a BI system where we have Reporting Services running on SharePoint, plus Excel Services and PPS services on SharePoint, and SSAS cubes reading from a warehouse or a mart. On top of that, some custom .NET coding for authentication or customised security, plus some self-service BI using QlikView or PowerPivot.

Back to the data warehouse architect, the ETL aspect of the job is sometimes quite demanding. There is something called ETL architecture in warehousing, which is basically

a) the architecture of the overnight batch, i.e. the structure of the tasks and workflows, the execution order of the workflows, the backup, the reconciliation, the checking and alerts, and the data quality. The overnight batch is not only about data loading / ETL. It also has data-serving elements, i.e. pre-processing reports (stored as PDFs, ready to be served to achieve split-second response times) and refreshing OLAP cubes.

b) the architecture of the continuous feed throughout the day for real time warehousing

c) the physical infrastructure of the ETL, i.e. the servers, the databases, the data connections

d) the methods to extract and load the data, i.e. sliding window, partition swapping, flipping twin tables, incremental extraction, changed data capture, change tracking, (filtered) replication between the warehouse and the mart (or mirroring), and how to extract data from cubes.
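As a toy illustration of one of these methods, timestamp-based incremental extraction, the sketch below filters a flat file on a last-load watermark. The file layout (id,updated_at) and the watermark value are invented for this example; a real implementation would query the source database instead.

```shell
# Toy sketch of timestamp-based incremental extraction from a flat file.
# Layout (id,updated_at) and the watermark are invented for this example.
printf '1,2010-06-01\n2,2010-06-10\n3,2010-06-15\n' > source.txt
watermark='2010-06-05'   # timestamp of the last successful load
# extract only the rows changed after the watermark
awk -F, -v w="$watermark" '$2 > w' source.txt > delta.txt
cat delta.txt
```

The same watermark idea underlies most incremental ETL: record the high-water mark of the last load, then pull only rows stamped after it.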

If a data warehouse architect only understands dimensional modelling, the company will have problems with the ETL and the infrastructure. The servers might not be ‘fit for purpose’; for example, the disk configuration might not be optimised for warehousing.

How about “BI solution architect”? A data warehouse is the back end; Business Intelligence is the front end. Data warehousing is about the data model, the ETL and the databases. BI is about the reports, the OLAP cubes, the analytical applications, the data mining, the KPIs, the dashboards, the scorecards, and the performance management. It is a common perception that a BI solution architect is a “front end” person, an application person. This perception is incorrect.

A “solution architect” is responsible for the whole solution: not only the front end, but also the back end. In the case of a BI solution architect, they need to look after both the BI front end (reports, cubes, performance management, etc.) and the back end (data warehouse, ETL); it is impossible to be responsible for the BI without looking after the data warehouse. In some companies, the role is called “data warehouse solution architect” instead of “BI solution architect”. That does not mean the role is only responsible for the back-end data warehouse; it is also responsible for the BI front end. Some companies have not one but several BI solution architects, each responsible for a certain area.

There are several different types of architects in IT, for example: system architect, data architect, solution architect, information architect and enterprise architect. This segregation of duties only happens in very large groups of companies (enterprises), for example a banking group with 3,000 IT staff. If the IT staff numbers only 100, one architect usually covers multiple functions. A system architect is responsible for infrastructure, including networks and servers. I have covered data architects and solution architects above. An information architect is responsible for the flow of information throughout the enterprise, including databases and data quality. An enterprise architect is responsible for all the applications in the enterprise, making sure they run in sync and adhere to the standards.

Source: https://dwbi1.wordpress.com/2010/06/16/data-architect-data-warehouse-architect-and-bi-solution-architect/


OBIEE 12c – INSTALLATION GUIDE PART I – PREREQUISITES

Make sure that JDK 8 is already installed. If it is not, download and install it first.

obieeinstall1
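A quick way to verify the JDK is to parse the `java -version` banner. The sketch below assumes the classic `1.8.0_xx` version-string format used by JDK 8, and hard-codes the banner line so it runs anywhere; on a real host you would capture the banner from the actual command.

```shell
# Check for a JDK 8 banner. On a real host you would capture it with:
#   banner=$(java -version 2>&1 | head -n 1)
# Here the banner is hard-coded so the sketch is self-contained.
banner='java version "1.8.0_202"'
# pull the major version out of a "1.x.y" style version string
major=$(echo "$banner" | sed 's/.*"1\.\([0-9][0-9]*\)\..*/\1/')
if [ "$major" = "8" ]; then
    echo "JDK 8 found"
else
    echo "install JDK 8 first"
fi
```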

Download the OBIEE 12c software and unzip it.

obieeinstall2

The first step is to install the Fusion Middleware 12c Infrastructure.

To start the installation, use the following command:

[oracle@apps12c OBIEE12c]$ java -d64 -jar fmw_12.2.1.0.0_infrastructure.jar

The welcome screen will appear. Just click Next to continue.

obieeinstall3

The next screen asks if you want auto-updates. Select Skip Auto Updates and click Next.

obieeinstall4

Next you need to specify the home location for Fusion Middleware, e.g. /apps/biee12c.

obieeinstall5

The next screen asks whether you want Fusion Middleware installed with examples. Select Fusion Middleware Infrastructure and click Next.

obieeinstall6

The installer will then perform prerequisite checks. Click Next.

obieeinstall7

On the next screen, uncheck the checkbox and click Next.

obieeinstall8

Now the installation is ready to begin.

obieeinstall9

Click Install and let it run through. It does not take long.

obieeinstall10

Once finished, you should see the summary screen. Click Finish to close the window.

obieeinstall11

OBIEE 12c – OBIS1 ISSUE AFTER MODIFYING RPD

We have noticed that the BI Server sometimes goes down and does not come back up after RPD changes are made. We tested several cases in our OBIEE 12c environment, and it appears to be an Oracle bug in OBIEE 12c (as of the latest version at the time of writing, 12.2.1.0.0).

The error message looks like this:

Unable to start the server obis1 : Received error message from Node Manager Server: [Server start command for OBIS server 'obis1' failed due to:

[Server failed to start up but Node Manager was not aware of the reason]. Please check Node Manager log and/or server 'obis1' log for detailed information.]. Please check Node Manager log for details.

Until the issue is fixed by Oracle, there is a workaround: restore the default RPD and then redeploy the latest version of your RPD. Of course, make sure you have backed up the latest changed RPD first.

Below are the detailed steps:

  • Navigate to the directory <obiee_home>/user_projects/domains/bi/bitools/bin/ and run the command sh stop.sh to stop all services.
  • After the OBIEE 12c services are stopped, navigate to the directory <obiee_home>/user_projects/domains/bi/bidata/service_instances/ssi/metadata/datamodel/customizations and delete liverpd.* and all other versions of the RPD.
  • Clear the content of the file default_diff.xml in the directory <obiee_home>/user_projects/domains/bi/bidata/service_instances/ssi/metadata/datamodel/customizations/default (Note: do not delete this file).
  • Navigate to the directory <obiee_home>/user_projects/domains/bi/bitools/bin/ and run the command nohup ./start.sh & to start all OBIEE 12c services in the background. You can run tail -f nohup.out to view the startup progress.
  • The system should come up with the default RPD. Log in to OBIEE 12c as the weblogic user to confirm that you can log in.
  • Now deploy your latest changed RPD using the data-model-cmd.sh script in <obiee_home>/user_projects/domains/bi/bitools/bin.
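The file operations in the delete and clear steps can be sketched as a dry run against a scratch directory standing in for <obiee_home>. The directory layout is recreated from the paths above; the stop/start scripts are omitted since they only exist on a real installation.

```shell
# Dry run of the RPD reset file operations against a scratch directory.
OBIEE_HOME=$(mktemp -d)   # stand-in for <obiee_home>
CUST="$OBIEE_HOME/user_projects/domains/bi/bidata/service_instances/ssi/metadata/datamodel/customizations"
mkdir -p "$CUST/default"
touch "$CUST/liverpd.rpd" "$CUST/liverpd.1.rpd"     # fake RPD versions
echo '<diff/>' > "$CUST/default/default_diff.xml"   # fake customisation diff

rm -f "$CUST"/liverpd.*                 # delete liverpd.* and all other RPD versions
: > "$CUST/default/default_diff.xml"    # clear the file, but do NOT delete it
```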

Hope this helps.

ORACLE BIAPPS – CONFIGURE SCD BEHAVIOR FOR CUSTOM DIMENSION COLUMN

In Oracle Business Intelligence Applications, customization is defined as changing the preconfigured behavior to enable you to analyze new information in your business intelligence dashboards. For example, you might want to add a column to a dashboard by extracting data from the source field and storing it in the Oracle Business Analytics Warehouse in the X_<Column_Name> field.

For an added custom dimension column, you must configure its SCD behavior after reverse-engineering it into the ODI model so that the load plan can populate data into that column. The following steps describe how to configure this:

  1. In ODI Designer, modify the dimension datastore. In the Models view, expand the ‘Oracle BI Applications’ folder, Oracle BI Applications (Model), Dimension (Submodel), and Columns.
  2. Double-click the column whose SCD behavior you want to change.
  3. In the Description subtab’s ‘Slowly Changing Dimensions Behavior’ drop-down list, select the column behavior. To implement Type I behavior, select Overwrite on Change. To implement Type II behavior, select Add Row on Change.

scd1

scd2
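To make the two behaviours concrete, the toy sketch below applies each one to a one-row dimension file. The `id,city,current_flag` layout is invented for the example and is not the warehouse table format.

```shell
# Toy contrast of SCD Type I vs Type II on a file with layout id,city,current_flag.
echo '100,London,Y' > dim.txt
# Type I (Overwrite on Change): update the attribute in place; history is lost
awk -F, 'BEGIN{OFS=","} $1==100{$2="Paris"} {print}' dim.txt > dim_type1.txt
# Type II (Add Row on Change): expire the old row, then append a new current row
awk -F, 'BEGIN{OFS=","} $1==100{$3="N"} {print}' dim.txt > dim_type2.txt
echo '100,Paris,Y' >> dim_type2.txt
```

After a Type I change only the new value survives; after a Type II change both the expired and the current row are kept, which is what lets reports slice history correctly.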

ORACLE BIAPPS – HOW TO ENABLE SOFT DELETE PROCESS

What is the issue?

During the OBI Apps implementation and customization projects we were confronted with an uncommon issue. Even though we had run many validation processes and had acquired the agreed approvals within the Development, SIT and UAT environments, some metric issues appeared after going live with the Production environment.

One of the issues was that a metric shown in an Oracle BI report differed from (normally it was greater than) the real number found in the data sources (i.e. EBS). The reason was that some records had been manually deleted in EBS, but the deletions were not reflected in the Data Warehouse.

In case you want to flag these records as deleted (soft delete) in the Data Warehouse, you must enable the related primary extract and delete mappings because this feature is disabled by default.

About Primary Extract and Delete Mappings Process

The primary extract mappings perform a full extract of the primary keys from the source system. Although many rows are generated by this extract, only the Key ID and Source ID are extracted from the source table. The primary extract mappings load these two columns into staging tables marked with a *_PE suffix.

The figure below provides an example of the beginning of the extract process. It shows the sequence of events over a two day period during which the information in the source table has changed. On day one, the data is extracted from a source table and loaded into the Oracle Business Analytics Warehouse table. On day two, Sales Order number three is deleted and a new sales order is received, creating a disparity between the Sales Order information in the two tables.

1

The figure above shows the primary extract and delete process that occurs when day two’s information is extracted and loaded into the Oracle Business Analytics Warehouse from the source. The initial extract brings record four into the Oracle Business Analytics Warehouse. Then, using a primary extract mapping, the system extracts the Key IDs and the Source IDs from the source table and loads them into a primary extract staging table.

The extract mapping compares the keys in the primary extract staging table with the keys in the most current Oracle Business Analytics Warehouse table. It looks for records that exist in the Oracle Business Analytics Warehouse but do not exist in the staging table (in the preceding example, record three), and sets the delete flag to Y in the Source Adapter, causing the corresponding record to be marked as deleted.

The extract mapping also looks for any new records that have been added to the source and do not already exist in the Oracle Business Analytics Warehouse; in this case, record four. Based on the information in the staging table, Sales Order number three is deleted from the Oracle Business Analytics Warehouse, as shown in the following figure. When the extract and load mappings run, the new sales order is added to the warehouse.

2
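The key comparison in the figures can be mimicked with two sorted key files: keys present in the warehouse but missing from the primary-extract staging table are soft-delete candidates, and keys present in the extract but not in the warehouse are new records. The key values below follow the sales-order example (record three deleted, record four added); the file names are invented for the sketch.

```shell
# Warehouse keys after day one vs. primary-extract keys on day two.
printf '1\n2\n3\n' | sort > warehouse_keys.txt
printf '1\n2\n4\n' | sort > pe_keys.txt
# comm -23: lines only in the warehouse -> candidates to mark as deleted
comm -23 warehouse_keys.txt pe_keys.txt > deleted.txt
# comm -13: lines only in the extract   -> new records to load
comm -13 warehouse_keys.txt pe_keys.txt > new.txt
```

This is exactly the set difference that the _IdentifyDelete SQL later in this post computes with a left outer join.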

The following graph describes how the Primary Extract and Delete mappings interact with the database tables:

3

  • The _Primary mappings perform a full extract of the primary keys from the EBS source system and load the result into the primary extract (_F_PE) table.
  • The _IdentifyDelete mappings identify records deleted in the source by comparing the primary extract table (_F_PE) with the target table (_F), and load the results into a staging table (_F_DEL).
  • The _SoftDelete mappings update the delete flag column to ‘Y’ on the target table (_F) for all the records identified as ‘deleted’, driving from the staging table (_F_DEL).

Enabling Soft Delete Process in Oracle BI Apps 11.1.1.8.1

In order to enable the Primary Extract and Delete mappings you will have to apply changes to the SOFT_DELETE_PREPROCESS data load parameter using Oracle BI Applications Configuration Manager (BIACM).

Here is the list of steps required:

  1. Log in to BIACM as the BI Applications Administrator user.
  2. Select the Manage Data Load Parameters link to display the Manage Data Load Parameter dialog.
  3. Select the Source Instance you need to configure and search for the SOFT_DELETE_PREPROCESS parameter, as shown below. Click Search.

4

  4. Select the SOFT_DELETE_PREPROCESS Parameter Code and you will see all the OOTB Dimensions/Facts for which you can enable the soft delete process.
  5. Change the Parameter Value to ‘Yes’ for the Dimensions/Facts where you want to enable the Primary Extract and Delete mappings, as shown below:

5

  6. By default, OBI Apps includes a filter on the DELETE_FLG field (DELETE_FLG=’N’), but it is good to double-check.

6

You can check whether the filter is added in the Logical Table Source, as in the screenshot above.

Soft Delete feature for new facts

The instructions above apply only to the pre-built facts in the Data Warehouse. For new facts, use the OOTB Primary Extract and Delete mappings as a reference and build similar custom mappings for those facts.

Below is sample SQL code generated by the Primary Extract and Delete mappings of the main process that populates the Purchase Cost data from the EBS source into the W_PURCH_COST_F table:

_Primary:

select
  TO_CHAR(PO_DISTRIBUTIONS_ALL.PO_DISTRIBUTION_ID) C1_INTEGRATION_ID
from BI_ACCNT.PO_DISTRIBUTIONS_ALL PO_DISTRIBUTIONS_ALL
where (1=1)
and (PO_DISTRIBUTIONS_ALL.CREATION_DATE >= TO_DATE(SUBSTR('#BIAPPS.INITIAL_EXTRACT_DATE',0,19),'YYYY-MM-DD HH24:MI:SS'))

_IdentifyDelete:

insert into W_PURCH_COST_F_DEL
(DATASOURCE_NUM_ID, INTEGRATION_ID )
select
T.DATASOURCE_NUM_ID, T.INTEGRATION_ID
from W_PURCH_COST_F  T
left outer join W_PURCH_COST_F_PE  S
on T.DATASOURCE_NUM_ID =S.DATASOURCE_NUM_ID and
T.INTEGRATION_ID =S.INTEGRATION_ID
where S.DATASOURCE_NUM_ID IS NULL
and S.INTEGRATION_ID IS NULL
and T.DELETE_FLG = 'N'
and T.CREATED_ON_DT > TO_DATE(SUBSTR('#BIAPPS.LAST_ARCHIVE_DATE',0,19),'YYYY-MM-DD HH24:MI:SS')
and exists
(select 1
from W_PURCH_COST_F_PE DS
where T.DATASOURCE_NUM_ID = DS.DATASOURCE_NUM_ID
)

_SoftDelete:

update W_PURCH_COST_F  T
set
T.DELETE_FLG = 'Y'
,T.W_UPDATE_DT = SYSDATE
,T.ETL_PROC_WID = #BIAPPS.ETL_PROC_WID
where (T.DATASOURCE_NUM_ID, T.INTEGRATION_ID) IN
(select D.DATASOURCE_NUM_ID, D.INTEGRATION_ID
from W_PURCH_COST_F_DEL  D
)

After you finish building the Primary Extract and Delete mappings, remember to apply the filter (DELETE_FLG='N') in the Logical Table Source.

ORACLE BIAPPS – TRIMMING RPD

Follow this procedure to extract projects from the full RPD. The end result of this process is a trimmed RPD.

To extract from the RPD the projects for the products you have purchased:

  1. Open a Command window on the computer where the BI Administration Tool is installed.
  2. If you installed Oracle BI EE on Windows, then run bi-init.cmd to launch a Command prompt that is initialized to your Oracle instance. This utility is located in:

<MiddlewareHome>\instances\instance<n>\bifoundation\OracleBIApplication\coreapplication\setup

If you installed the BI Administration Tool using the BI Client installer, then run bi_init.bat to launch a Command prompt that is initialized to your Oracle instance. This file is located in:

<Oracle BI Home>\oraclebi\orahome\bifoundation\server\bin

3. In a Command prompt window, run ExtractProjects, as described below:

  • If you installed Oracle BI EE on Windows, ExtractProjects.exe is located in <Oracle Home for BI>\bifoundation\server\bin.
  • If you installed the BI Administration Tool using the BI Client installer, ExtractProjects.exe is located in <Oracle BI Home>\oraclebi\orahome\bifoundation\server\bin.

Run one of the following commands:

For extracting a single project:

ExtractProjects -B input_rpd -O output_rpd -I "project_name"

For extracting multiple projects:

ExtractProjects -B input_rpd -O output_rpd -I "project_name1" -I "project_name2" -I "project_name3" (and so on)

where:

input_rpd is the name and path of the full (delivered) release 11.1.1.8.1 RPD from which you want to extract the project or projects (for example, OracleBIApps.rpd).

output_rpd is the name and path of the RPD you want to create with the extracted projects (for example, OracleBIAppsTrimmed.rpd).

project_name is the name of the RPD project you want to extract.

You will be prompted to enter the encryption password for the RPD (input_rpd).

The list of projects in the 11.1.1.8.1 RPD includes the following:

  • Financial Analytics Fusion Edition
  • Human Resources Analytics Fusion Edition
  • Marketing Analytics Fusion Edition
  • Partner Analytics Fusion Edition
  • Project Analytics Fusion Edition
  • Sales Analytics Fusion Edition
  • Supply Chain and Order Management Analytics Fusion Edition
  • Student Information Analytics
  • Service Analytics
  • Price Analytics
  • Manufacturing Analytics
  • DataLineage_Project

Note: The RPD contains projects in addition to those listed above. These projects are included for future content delivery and upgrade support. To determine the BI Applications available in this release, see “System Requirements and Supported Platforms,” for Oracle BI Applications release 11.1.1.8.1 at http://www.oracle.com/technetwork/middleware/ias/downloads/fusion-certification-100350.html.

4. Save and rename the trimmed RPD. Make sure the name identifies this RPD as one that has been trimmed, for example, OracleBIAppsTrimmed.rpd.