Monday, 27 January 2014

OTN Developer Day Virtual Conference - Tues Feb 4


Make sure you are free on Tuesday February 4 because the OTN team are hosting another of their virtual developer day events. Most importantly, it is FREE. Even more important is the fact that I will be running a 12c pattern matching workshop at 11:45am Pacific Time. Of course there are lots of other sessions that you can attend relating to big data and Oracle Database 12c, and the OTN team has created two streams to help you learn about these two important areas:
  • Oracle Database application development — Learn expert tips and tricks on how to develop applications for Oracle Database 12c and Big Data environments more effectively.
  • Oracle Database platform deployment processes — From integration, to data migration, experts showcase new capabilities in Oracle 12c and Big Data environments that will allow you to deliver greater database performance and integration.
You can sign up for the event and pick your tracks and sessions via this link: https://oracle.6connex.com/portal/database2014/login?langR=en_US&mcc=aceinvite.
My pattern matching session is included in the Oracle 12c DBA section of the application development track and the workshop will cover the following topics:
  • Part 1 - Introduction to SQL Pattern Matching
  • Part 2 - Pattern Match: simple example
  • Part 3 - How to use built-in measures
  • Part 4 - Searching for more complex patterns
  • Part 5 - Deep dive into how SQL Pattern Matching works
  • Part 6 - More Advanced Topics
As my session is only 45 minutes long I am only going to cover the first three topics and leave you to work through the last three topics in your own time. During the 45-minute workshop I will be available to answer any questions via the live Q&A chat feature.
There is a link to the full agenda on the invitation page. The OTN team will be providing a Database 12c VirtualBox VM that you will be able to download later this week. For the pattern matching session I will be providing the scripts to install our sample schema, the slides from the webcast and the workshop files, which include a whole series of exercises that will help you learn about pattern matching and test your SQL skills.
The big data team has kindly included my pattern matching content inside their VirtualBox image, so if you want to focus on the sessions offered on the big data tracks but still want to work on the pattern matching exercises after the event then you will have everything you need already installed and ready to go!
Don't forget to register as soon as possible and I hope you have a great day… Let me know if you have any questions or comments.


Friday, 24 January 2014

How to create more sophisticated reports with SQL….


This is a continuation of a series of posts that cover Oracle's SQL extensions for reporting and analysis. As reviewed in earlier blog posts (Part 1 and Part 2), we have extended SQL's analytical processing capabilities by introducing a family of aggregate and analytic SQL functions. Over the last couple of days I have been exploring how to use some of these SQL features to create tabular-style reports/views. In the past, when I worked as a BI Beans product manager, I would typically build these types of reports using OLAP cubes and/or Java code. It has been very interesting to work through some of my old report scenarios and move my Java aggregation processing back into the database, using SQL to do all the heavy lifting without having to resort to low-level coding.

In my quest to learn about SQL analytics I started with the idea of creating a simple two-dimensional crosstab report, as shown below (I am using the SH sample schema if you want to take my code and run it yourself). A typical requirement would be to report total category sales by quarter and compute totals for the full year (in this case 2001) across all product categories:

 

QTR        'Software/Other' 'Hardware' 'Electronics' 'Photo'   'Peripherals and Accessories' 'Cat_Total'
---------- ---------------- ---------- ------------- ---------- ----------------------------- ----------
2001-01           860819.81 1301343.45    1239287.71 1370706.38                    1774940.09 6547097.44
2001-02           872157.38 1557059.59    1144187.90 1563475.51                    1785588.01 6922468.39
2001-03           877630.85 1651454.29    1017536.82 1607315.63                    2042061.04 7195998.63
2001-04           943296.36 1174512.68    1303838.52 1792131.39                    2257118.57 7470897.52
2001-Total       3553904.40 5684370.01    4704850.95 6333628.91                    7859707.71 28136461.98

 

This is typical of the type of output I used to create for display in reports and dashboards. Of course the formatting could be better, but I would normally sort that out on the client side of the application. Which leaves the question of how to build this type of report. When developing a SQL statement I always try to build it up in stages, starting with the simple first step of just getting the correct dataset. In this case my initial SQL statement was as follows:

SELECT 
t.calendar_quarter_desc,
p.prod_category,
sum(s.amount_sold) as amount_sold
FROM times t,
products p,
sales s
WHERE t.time_id = s.time_id
AND p.prod_id = s.prod_id
AND s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy')
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY calendar_quarter_desc, prod_category
ORDER BY calendar_quarter_desc, prod_category

which generated the following output:

CALENDAR_QUARTER_DESC PROD_CATEGORY                                      AMOUNT_SOLD
--------------------- -------------------------------------------------- -----------
2001-01               Electronics                                          1239287.71
2001-01               Hardware                                             1301343.45
2001-01               Peripherals and Accessories                          1774940.09
2001-01               Photo                                                1370706.38
2001-01               Software/Other                                        860819.81
2001-02               Electronics                                          1144187.90
2001-02               Hardware                                             1557059.59
2001-02               Peripherals and Accessories                          1785588.01
2001-02               Photo                                                1563475.51
2001-02               Software/Other                                        872157.38
2001-03               Electronics                                          1017536.82
2001-03               Hardware                                             1651454.29
2001-03               Peripherals and Accessories                          2042061.04
2001-03               Photo                                                1607315.63
2001-03               Software/Other                                        877630.85
2001-04               Electronics                                          1303838.52
2001-04               Hardware                                             1174512.68
2001-04               Peripherals and Accessories                          2257118.57
2001-04               Photo                                                1792131.39
2001-04               Software/Other                                        943296.36

20 rows selected

 

This is my starting point. My next step is to pivot the data so that I have the time periods in the first column, followed by a separate column showing total revenue (amount_sold) for each product category. To create the new columns of data containing the revenue for each product category I used the PIVOT clause, as shown below:

SELECT * FROM
(SELECT
t.calendar_quarter_desc,
p.prod_category,
s.amount_sold
FROM times t,
products p,
sales s
WHERE t.time_id = s.time_id
AND p.prod_id = s.prod_id
AND s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy')
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
ORDER BY calendar_quarter_desc)
PIVOT(sum(amount_sold)
FOR prod_category
IN ('Software/Other', 'Hardware', 'Electronics', 'Photo',
'Peripherals and Accessories'))
ORDER BY calendar_quarter_desc;

Notice the first 'SELECT * FROM' statement. This approach/framework is the same one we use for some of our other SQL analytical features, such as the 12c MATCH_RECOGNIZE clause (see the sketch after the output below). The initial SQL statement allows us to place another layer of analysis, in this case the pivot operation, over the result set that is derived from the main SQL statement. The above SQL generated the following output:

 

calendar_quarter_desc 'Software/Other' 'Hardware' 'Electronics' 'Photo' 'Peripherals and Accessories'
--------------------- ---------------- ---------- ------------- ---------- -----------------------------
2001-01                      860819.81 1301343.45 1239287.71    1370706.38                    1774940.09
2001-02                      872157.38 1557059.59 1144187.90    1563475.51                    1785588.01
2001-03                      877630.85 1651454.29 1017536.82    1607315.63                    2042061.04
2001-04                      943296.36 1174512.68 1303838.52    1792131.39                    2257118.57
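
As mentioned above, the 12c MATCH_RECOGNIZE clause uses exactly the same layered framework. Here is a skeletal sketch to show the shape; the ticker table with columns (symbol, tstamp, price) is hypothetical, so treat this as an illustration rather than a worked example:

-- Sketch: the same 'SELECT * FROM (subquery)' framework, this time
-- feeding the 12c MATCH_RECOGNIZE clause. The ticker table
-- (symbol, tstamp, price) is hypothetical.
SELECT * FROM
(SELECT symbol, tstamp, price FROM ticker)
MATCH_RECOGNIZE(
PARTITION BY symbol
ORDER BY tstamp
MEASURES FIRST(tstamp) AS start_tstamp,
         LAST(tstamp) AS end_tstamp
ONE ROW PER MATCH
PATTERN (strt down+ up+)
DEFINE
down AS price < PREV(price),
up AS price > PREV(price)
);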

 

You will notice that the PIVOT clause has three main elements inside the PIVOT keyword, which acts as the container clause: the aggregation operation, in this case sum(amount_sold); the FOR element, which identifies the dimension or column that is going to be pivoted; and the IN element, which lists the dimension members that will form the individual columns. The way the IN clause is structured does seem to impose an unnecessary restriction in that you have to list the actual dimension members: it is not possible to supply a dynamic list of members derived from, say, a SELECT DISTINCT… FROM statement. If you can overlook this limitation then the PIVOT clause is very powerful.
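
If you really do need a dynamic list of pivot columns, there is a PIVOT XML variant that does accept a subquery (or the ANY keyword) in the IN clause, although the pivoted results are returned as a single XMLTYPE column rather than as regular relational columns. A minimal sketch against the same SH schema:

-- Sketch: PIVOT XML accepts a subquery in the IN clause, at the cost
-- of returning the pivoted values in one XMLTYPE column.
SELECT * FROM
(SELECT t.calendar_quarter_desc,
p.prod_category,
s.amount_sold
FROM times t, products p, sales s
WHERE t.time_id = s.time_id
AND p.prod_id = s.prod_id)
PIVOT XML(sum(amount_sold)
FOR prod_category
IN (SELECT DISTINCT prod_category FROM products));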

Now that we have our basic cross-tabular output, it would be nice to add a subtotal across the quarters, giving us the annual sales revenue for each product category. Fortunately, because we are using standard Oracle SQL syntax we can easily introduce other SQL analytic operations into our statement. Using the CUBE() feature we can create the annual subtotals as follows:

SELECT * FROM
(SELECT
t.calendar_quarter_desc,
p.prod_category,
sum(s.amount_sold) as amount_sold
FROM times t,
products p,
sales s
WHERE t.time_id = s.time_id
AND p.prod_id = s.prod_id
AND s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy')
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY CUBE(calendar_quarter_desc), prod_category
ORDER BY calendar_quarter_desc)
PIVOT(sum(amount_sold)
FOR prod_category
IN ('Software/Other', 'Hardware', 'Electronics',
'Photo', 'Peripherals and Accessories'))
ORDER BY calendar_quarter_desc;

This now generates our subtotal, or annual total, across all quarters:

CALENDAR_QUARTER_DESC 'Software/Other' 'Hardware' 'Electronics' 'Photo' 'Peripherals and Accessories'
--------------------- ---------------- ---------- ------------- ---------- -----------------------------
2001-01                      860819.81 1301343.45    1239287.71 1370706.38                    1774940.09
2001-02                      872157.38 1557059.59    1144187.90 1563475.51                    1785588.01
2001-03                      877630.85 1651454.29    1017536.82 1607315.63                    2042061.04
2001-04                      943296.36 1174512.68    1303838.52 1792131.39                    2257118.57
                            3553904.40 5684370.01    4704850.95 6333628.91                    7859707.71
 

 

You will notice that the last line has a null value for the quarter description. It would be nice to add a label to describe this row, something like '2001-Total'. The CUBE operation does allow us to track which rows are data rows and which rows have been calculated, and this is achieved using the GROUPING function. We can wrap a DECODE statement around the t.calendar_quarter_desc column and test each row to see if it is a data point (where the GROUPING function returns 0) or a total (where it returns 1), as follows:

SELECT * FROM
(SELECT
DECODE(GROUPING(t.calendar_quarter_desc)
, 0, t.calendar_quarter_desc
, 1, '2001-Total') as calendar_quarter_desc,
p.prod_category,
sum(s.amount_sold) as amount_sold
FROM times t,
products p,
sales s
WHERE t.time_id = s.time_id
AND p.prod_id = s.prod_id
AND s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy')
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY CUBE(calendar_quarter_desc), prod_category
ORDER BY calendar_quarter_desc)
PIVOT(sum(amount_sold)
FOR prod_category
IN ('Software/Other', 'Hardware', 'Electronics',
'Photo', 'Peripherals and Accessories'))
ORDER BY calendar_quarter_desc;

 

Now our null total line shows the value "2001-Total":

CALENDAR_QUARTER_DESC 'Software/Other' 'Hardware' 'Electronics' 'Photo' 'Peripherals and Accessories'
--------------------- ---------------- ---------- ------------- ---------- -----------------------------
2001-01                      860819.81 1301343.45 1239287.71    1370706.38                    1774940.09
2001-02                      872157.38 1557059.59 1144187.90    1563475.51                    1785588.01
2001-03                      877630.85 1651454.29 1017536.82    1607315.63                    2042061.04
2001-04                      943296.36 1174512.68 1303838.52    1792131.39                    2257118.57
2001-Total                  3553904.40 5684370.01 4704850.95    6333628.91                    7859707.71

 

As the CUBE operation makes it easy to add totals to our report, we can repeat this process for the product category dimension so that our CUBE clause looks like this:

GROUP BY CUBE(calendar_quarter_desc, prod_category)
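
As an aside, a two-column CUBE is just shorthand for the four possible grouping combinations, so the same subtotals could be requested explicitly with GROUPING SETS:

-- CUBE(calendar_quarter_desc, prod_category) expands to these four
-- grouping combinations:
GROUP BY GROUPING SETS ((calendar_quarter_desc, prod_category),
(calendar_quarter_desc),
(prod_category),
())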

This extended CUBE clause will now add column totals to our report and a grand total in the bottom right corner of our grid of data. To create the cross-tabular report totals for both quarters and product categories we simply extend the CUBE statement to include prod_category and tidy up some of the column labels:

SELECT * FROM
(SELECT
DECODE(GROUPING(t.calendar_quarter_desc)
, 0, t.calendar_quarter_desc
, 1, '2001-Total') as "Time",
DECODE(GROUPING(p.prod_category)
, 0, p.prod_category
, 1, 'Cat_Total') AS prod_category,
sum(s.amount_sold) as amount_sold
FROM times t,
products p,
sales s
WHERE t.time_id = s.time_id
AND p.prod_id = s.prod_id
AND s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy')
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY CUBE(calendar_quarter_desc, prod_category)
ORDER BY calendar_quarter_desc)
PIVOT(sum(amount_sold)
FOR prod_category
IN ('Software/Other', 'Hardware', 'Electronics',
'Photo', 'Peripherals and Accessories', 'Cat_Total'))
ORDER BY "Time";

There are three key changes here: the second DECODE() function, which evaluates the prod_category data to determine whether each row is a data point or a total line; the extended CUBE() function, which now includes prod_category and therefore generates the relevant subtotals; and the extended list of categories in the IN part of the PIVOT clause, which picks up our report total, 'Cat_Total'. Now the output looks like this:

Time       'Software/Other' 'Hardware' 'Electronics' 'Photo'    'Peripherals and Accessories' 'Cat_Total'
---------- ---------------- ---------- ------------- ---------- ----------------------------- ----------
2001-01           860819.81 1301343.45    1239287.71 1370706.38                    1774940.09 6547097.44
2001-02           872157.38 1557059.59    1144187.90 1563475.51                    1785588.01 6922468.39
2001-03           877630.85 1651454.29    1017536.82 1607315.63                    2042061.04 7195998.63
2001-04           943296.36 1174512.68    1303838.52 1792131.39                    2257118.57 7470897.52
2001-Total       3553904.40 5684370.01    4704850.95 6333628.91                    7859707.71 28136461.98

 

As I started to expand this little code sample to include more data columns I ran into a small problem. Let's say we want to know how many orders were placed as well as the amount of revenue for each product category. To do that we need to add an additional column of data, which means that we need to expand our PIVOT clause. In the code below I have added a COUNT() function to count the number of transactions (I am making the assumption that each row in the fact table equates to a single transaction) and then included this column in the PIVOT clause:

 

SELECT * FROM
(SELECT
DECODE(GROUPING(t.calendar_quarter_desc)
, 0, t.calendar_quarter_desc
, 1, '2001-Total') as "Time",
DECODE(GROUPING(p.prod_category)
, 0, p.prod_category
, 1, 'Cat_Total') AS prod_category,
sum(s.amount_sold) as amount_sold,
count(s.amount_sold) as no_of_trans
FROM times t,
products p,
sales s
WHERE t.time_id = s.time_id
AND p.prod_id = s.prod_id
AND s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy')
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY CUBE(calendar_quarter_desc, prod_category)
ORDER BY calendar_quarter_desc)
PIVOT(sum(amount_sold) as Revenue,
sum(no_of_trans) as Transactions
FOR prod_category
IN ('Software/Other', 'Hardware', 'Electronics',
'Photo', 'Peripherals and Accessories', 'Cat_Total'))
ORDER BY "Time";

Notice that I have added a column alias for each calculated measure in the PIVOT clause. This is best practice and is recommended in the SQL Reference manual (page 1635), which states: "You can optionally provide an alias for each pivot column value. If there is no alias, the column heading becomes a quoted identifier." See this extract from the 12c documentation: http://docs.oracle.com/cd/E16655_01/server.121/e17209/statements_10002.htm#SQLRF55233

 

However, even after adding the column aliases we still get the following error message:

"ORA-00918: column ambiguously defined"

The above code is actually correct in terms of its syntax, so why, given that we followed the recommended guidance in the documentation, does it generate an ORA-00918 error?

The reason the error is generated is that we have exceeded the maximum length of a column name. During the PIVOT operation the columns are renamed: the first pivoted column becomes our first product category, 'Software/Other', plus the name of our first data column, 'Revenue'. The second column becomes 'Software/Other' plus the name of our second data column, 'Transactions'. This means that the generated column names can get very long. In fact they get so long that when we reach the last product category in our list, 'Peripherals and Accessories', the automatic naming process generates a column name that is simply too long and the statement fails with "ORA-00918: column ambiguously defined". The message is slightly misleading because the actual problem is that the automatically generated column name exceeds the maximum identifier length.
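
A quick way to see the scale of the problem is to check the length of the name that would be generated for the longest category (a sketch; note that in Oracle Database 12.1 identifiers are limited to 30 bytes):

-- Sketch: the generated pivot heading is the IN-list value plus the
-- measure alias. For the longest category this is 35 characters even
-- before any quoting, well beyond the 30-byte identifier limit.
SELECT LENGTH('Peripherals and Accessories_REVENUE') AS name_length
FROM dual;
-- returns 35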

If we remove the 'Peripherals and Accessories' product category from the IN list of the PIVOT clause then the code will work:

SELECT * FROM
(SELECT
DECODE(GROUPING(t.calendar_quarter_desc)
, 0, t.calendar_quarter_desc
, 1, '2001-Total') as "Time",
DECODE(GROUPING(p.prod_category)
, 0, p.prod_category
, 1, 'Cat_Total') AS prod_category,
sum(s.amount_sold) as amount_sold,
count(s.amount_sold) as no_of_trans
FROM times t,
products p,
sales s
WHERE t.time_id = s.time_id
AND p.prod_id = s.prod_id
AND s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy')
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY CUBE(calendar_quarter_desc, prod_category)
ORDER BY calendar_quarter_desc)
PIVOT(sum(amount_sold) as Revenue,
sum(no_of_trans) as Transactions
FOR prod_category
IN ('Software/Other', 'Hardware', 'Electronics', 'Photo', 'Cat_Total'))
ORDER BY "Time";

and this will now generate the following output:

 

Time       'Software/Other'_REVENUE 'Software/Other'_TRANSACTIONS 'Hardware'_REVENUE 'Hardware'_TRANSACTIONS 'Cat_Total'_REVENUE 'Cat_Total'_TRANSACTIONS
---------- ------------------------ ----------------------------- ------------------ ----------------------- ------------------- ------------------------
2001-01                   860819.81                         24985         1301343.45                    1029          6547097.44                    60608
2001-02                   872157.38                         25469         1557059.59                    1281          6922468.39                    63292
2001-03                   877630.85                         25904         1651454.29                    1379          7195998.63                    65769
2001-04                   943296.36                         26217         1174512.68                    1057          7470897.52                    69749
2001-Total               3553904.40                        102575         5684370.01                    4746         28136461.98                   259418

 

 

Of course this is not really a workable solution since we have lost a column of data! What is missing from the code (and from the documentation, which will be fixed as soon as possible) is the inclusion of aliases for each of our product category names in the IN() clause. Therefore, if we amend the IN clause as follows:

IN ('Software/Other' as "S", 'Hardware' as "H", 'Electronics' as "E",
'Photo' as "P", 'Peripherals and Accessories' as "P-A", 'Cat_Total'))

then the code would look like this:

SELECT * FROM
(SELECT
DECODE(GROUPING(t.calendar_quarter_desc)
, 0, t.calendar_quarter_desc
, 1, '2001-Total') as "Time",
DECODE(GROUPING(p.prod_category)
, 0, p.prod_category
, 1, 'Cat_Total') AS prod_category,
sum(s.amount_sold) as amount_sold,
count(s.amount_sold) as no_of_trans
FROM times t,
products p,
sales s
WHERE t.time_id = s.time_id
AND p.prod_id = s.prod_id
AND s.time_id BETWEEN TO_DATE('01-JAN-2001','dd-MON-yyyy')
AND TO_DATE('31-DEC-2001','dd-MON-yyyy')
GROUP BY CUBE(calendar_quarter_desc, prod_category)
ORDER BY calendar_quarter_desc)
PIVOT(sum(amount_sold) as Revenue,
sum(no_of_trans) as Transactions
FOR prod_category
IN ('Software/Other' as "S", 'Hardware' as "H", 'Electronics' as "E",
'Photo' as "P", 'Peripherals and Accessories' as "P-A", 'Cat_Total'))
ORDER BY "Time";
 

and would generate the following output: 

Time       S_REVENUE  S_TRANSACTIONS H_REVENUE  H_TRANSACTIONS E_REVENUE  E_TRANSACTIONS P_REVENUE  P_TRANSACTIONS P-A_REVENUE P-A_TRANSACTIONS 'Cat_Total'_REVENUE 'Cat_Total'_TRANSACTIONS
---------- ---------- -------------- ---------- -------------- ---------- -------------- ---------- -------------- ----------- ---------------- ------------------- ------------------------
2001-01     860819.81          24985 1301343.45           1029 1239287.71          10063 1370706.38           8004  1774940.09            16527          6547097.44                   60608
2001-02     872157.38          25469 1557059.59           1281 1144187.90          12074 1563475.51           8408  1785588.01            16060          6922468.39                   63292
2001-03     877630.85          25904 1651454.29           1379 1017536.82          11152 1607315.63           8651  2042061.04            18683          7195998.63                   65769
2001-04     943296.36          26217 1174512.68           1057 1303838.52          12859 1792131.39           9724  2257118.57            19892          7470897.52                   69749
2001-Total 3553904.40         102575 5684370.01           4746 4704850.95          46148 6333628.91          34787  7859707.71            71162         28136461.98                  259418
 

 

Now I can see how much revenue each category generated each quarter and for the full year along with the number of orders that were placed. Very nice and very simple!

Apart from finding an obscure but interesting SQL bug, what have I learned here? What is amazing, at least for me, is the power of SQL analytics. Using features such as CUBE and PIVOT you can push a lot of application processing back inside the database and make your application code a lot simpler. You don't have to create temporary tables or pull data into the middle-tier server and then walk through the data points to calculate row and column totals; SQL analytics will manage all of this for you. Your application code will be simpler and it is very likely that the performance of your system will improve as well.

I hope this is useful and if you have any questions then let me know. 

 


Friday, 17 January 2014

StubHub's Data Scientists reap benefits of integrated approach….

We have released yet another great customer video, this time with StubHub.

StubHub provides the world's largest fan-to-fan ticket marketplace. The company was formed in 2000 and now dominates the market by making sure fans have a truly open marketplace where they can buy or sell tickets without restrictions or limitations. For more information about StubHub and its services visit their website: http://www.stubhub.com

StubHub is now getting real business benefit from moving its data analytics inside its data warehouse. This seems like an obvious way to build a data warehouse, but many customers are still pulling data out of their data warehouse and shipping it to specialised processing engines so they can mine their data, run spatial analytics and/or build multi-dimensional cubes. The problem with this approach, as the team at StubHub points out, is that when you move the data to these specialised engines you typically have to work with a subset of the data that is sitting in your data warehouse. When you work with a subset of data you immediately start to impose compromises on your analytical workflows: if you can't work with all your data then you can't be sure that your analytical model is as good as it could be, and that could mean losing customers or missing out on additional revenue.

The other problem comes from everyone using their own favourite tool to do their analysis: how do you share your discoveries, and how do you develop a high level of corporate-wide analytical skills?

Stubhub-Video

Click on the above image to watch the video

StubHub asked Oracle to help them resolve these two key problems. Their data scientists have now adopted open source R as their standard tool for data mining projects. By adopting Oracle's R Enterprise solution they have been able to push all the analytical processing linked to their data mining workflows back inside their Oracle data warehouse. Oracle Advanced Analytics is optimised to provide real-time analytics that delivers insight into key business subjects such as product recommendations and fraud alerting. Most importantly, because the data mining workflows run inside the Oracle Database, the data scientists have access to all the data inside their data warehouse and no longer have to rely on extracting small subsets of data; the data remains completely secure, and they benefit from the built-in operational scalability of the Oracle Database.

To watch the video you can click on the image above or click here to go directly to the Oracle MediaNetwork page.

For more information about Oracle's R Enterprise solution use the following link: http://www.oracle.com/us/products/database/options/advanced-analytics/overview/index.html

 


Monday, 6 January 2014

CaixaBank deploys new big data infrastructure on Oracle

CaixaBank is Spain's largest domestic bank by market share, with a customer base of 13.7 million. It is also Spain's leading bank in terms of innovation and technology, and one of the most prominent innovators worldwide. CaixaBank was recently awarded the title of the World's Most Innovative Bank at the 2013 Global Banking Innovation Awards (November 2013).

Like most financial services companies, CaixaBank wants to get closer to its customers by collecting data about their activities across all the different channels (offices, internet, phone banking, ATMs, etc.). In the old days we used to call this CRM, and then it morphed into the "360-degree view" and so on. While many companies have delivered these types of projects, and customers feel much more connected and in control of their relationship with their bank, the capture of streams of big data has the potential to create another revolution in the way we interact with our banks. What banks like CaixaBank want to do is capture data in one part of the business and make it available to all the other lines of business as quickly as possible.

Big data is allowing businesses like CaixaBank to significantly enhance the business value of their existing customer data by integrating it with all sorts of other internal and external data sets. This is probably the most exciting part of big data because the potential business benefits are really only constrained by the imagination of the team working on these types of projects. However, that in itself does create problems in terms of securing funding and ownership of projects, because the benefits can be difficult to estimate. This is where all the industry use cases, conference papers and blog posts can help, by providing insight into what is going on across the market in broad general terms.

To help them implement a strategic Big Data project, CaixaBank has selected Oracle for the deployment of its new Big Data infrastructure. This project, which includes an array of Oracle solutions, positions CaixaBank at the forefront of innovation in the banking industry. The new infrastructure will allow CaixaBank to maximize the business value from any kind of data and embark on new business innovation projects based on valuable information gathered from large data sets. Projects currently under review include:

  • Development of predictive models to improve customer value
  • Identifying cross-selling and up-selling opportunities
  • Development of personalized offers to customers
  • Reinforcement of risk management and brand protection services
  • More powerful fraud analysis
  • General streamlining of current processes to reduce time-to-market
  • Support for new regulatory requirements

The Oracle solution (including Oracle Engineered Systems, Oracle software and Oracle Consulting Services) consists of the implementation of a new Information Management Architecture that provides a unified corporate data model and new advanced analytic capabilities. (For more information about how Oracle's Reference Architecture can help you integrate structured, semi-structured and unstructured information into a single logical information resource that can be exploited for commercial gain, click here to view our whitepaper.)

The importance of the project is best explained by Juan Maria Nin, CEO of CaixaBank:

“Business innovation is the key for success in today’s highly competitive banking environment. The implementation of this Big Data solution will help CaixaBank remain at the forefront of innovation in the financial sector, delivering the best and most competitive services to our customers”.

The Oracle press release is here: https://emeapressoffice.oracle.com/Press-Releases/CaixaBank-Selects-Oracle-for-the-Deployment-of-its-New-Big-Data-Infrastructure-4183.aspx

 

 
