Connecting Apache Zeppelin to your Oracle Data Warehouse

In my last posts I provided an overview of the Apache Zeppelin open source project which is a new style of application called a “notebook”. These notebook applications typically runs within your browser so as an end user there is no desktop software to download and install.

Interestingly, I had a very quick response to this article asking about how to setup a connection within Zeppelin to an Oracle Database. Therefore, in this post I am going to look at how you can install the Zeppelin server and create a connection to your Oracle data warehouse.

This aim of this post is to walk you through the following topics:
Installing ZeppelinConfiguring ZeppelinWhat is an interpreterFinding and installing the Oracle JDBC driversSetting up a connection to an Oracle PDB Firstly a quick warning! There are a couple of different versions of Zeppelin available for download. At the moment I am primarily using version 0.6.2 which works really well. Currently, for some reason I am seeing performance probl…

Using Apache Zeppelin Notebooks with your Oracle Data Warehouse

Over the past couple of weeks I have been looking at one of the Apache open source projects called Zeppelin. It’s a new style of application called a “notebook” which typically runs within your browser. The idea behind notebook-style applications like Zeppelin is to deliver an adhoc data-discovery tool - at least that is how I see it being used. Like most notebook-style applications, Zeppelin provides a number of useful data-discovery features such as:
a simple way to ingest dataaccess to languages that help with data discovery and data analyticssome basic data visualization toolsa set of collaboration services for sharing notebooks (collections of reports) Zeppelin is essentially a scripting environment for running ordinary SQL statements along with a lot of other languages such as Spark, Python, Hive, R etc. These are controlled by a feature called “interpreters” and there is a list of the latest interpreters available here.

A good example of a notebook-type of application is R Stud…

My query just got faster - brief introduction to 12.2 in-memory cursor duration temp tables

This post covers one of the new SQL performance enhancements that we incorporated into Database 12c Release 2. All of these enhancements are completely automatic, i.e. transparent to the calling app/developer code/query. These features are enabled by default because who doesn’t want their queries running faster with zero code changes?

So in this post I am going to focus on the new In-Memory “cursor duration” temporary table feature. Let’s start by looking at cursor duration temp tables…
Above image courtesy of What is a cursor duration temp table? This is a feature that has been around for quite a long time. Cursor duration temporary tables (CDTs) are used to materialize intermediate results of a query to improve performance or to support complex multi step query execution. The following types of queries commonly use cursor duration temp tables:
WITH Clause and parallel recursive WITHGrouping SetsStar TransformationFrequent Item Set CountingXLATE  What happens during the …

Sneak preview of demo for Oracle Code events

I will be presenting at a number of the Oracle Code events over the coming months on the subject of…..(drum roll please) SQL pattern matching. Oracle Code is a great series of conferences dedicated to developers who want to get the absolute maximum benefit from using today's cutting edge technologies. If you want to register for any of the dates listed below then follow this link to the registration page.

North and Latin America
San Francisco ,  March 1, 2017
Austin ,  March 8, 2017
New York City ,  March 21, 2017
Washington DC ,  March 27, 2017
Toronto ,  April 18, 2017
Atlanta June 22, 2017
Sao Paulo , June 27, 2017
Mexico City ,  June 29, 2017 Europe and Middle East
London , April 20, 2017
Berlin , April 24, 2017
Prague , April 28, 2017
Moscow , May 22, 2017
Brussels , June 6, 2017
Tel Aviv , July 11, 2017 Asia
New Delhi , May 10, 2017
Tokyo , May 18, 2017
Beijing , July 14, 2017
Sydney , July 18, 2017
Seoul , August 30, 2017
Bangalore , August 4, 2017
Back to my session...the ac…

MATCH_RECOGNIZE: Can I use MATCH_NUMBER() as a filter?

Recently I spotted a post on OTN that asked the question: Can MATCH_RECOGNIZE skip out of partition? This requires a bit more detail because it raises all sorts of additional questions. Fortunately the post included more information which went something like this:
after a match is found I would like match_recognize to stop searching - I want at most one match per partition. I don’t want to filter by MATCH_NUMBER() in an outer query - that is too wasteful (or, in some cases, I may know in advance that there is at most one match per partition, and I don’t want match_recognize to waste time searching for more matches which I know don't exist). Can MATCH_RECOGNIZE do this? Short answer is: NO.
Long answer is: Still NO.

Going back to the original question… you could interpret it as asking “is it possible to only return the first match”? The answer to this question is YES, it is possible.

There are a couple of different ways of doing it. Let’s use our good old “TICKER”  data set. The po…

It's out now - Database 12c Release 2 available for download

Database 12c Release 2 available for download Yes, it’s the moment the world has been waiting for: the latest generation of the world’s most popular database, Oracle Database 12c Release 2 (12.2) is now available everywhere - in the Cloud and on-premises. You can download this latest version from the database home page on OTN - click on the Downloads tab.So What’s New in 12.2 for Data Warehousing?This latest release provides some incredible new features for data warehouse and big data. If you attended last year’s OpenWorld event in San Francisco then you probably already know all about the new features that we have added to 12.2 - checkout my blog post from last year for a comprehensive review of #oow16:Blog: The complete review of data warehousing and big data content from Oracle OpenWorld 2016 If you missed OpenWorld and if you are a data warehouse architect, developer or DBA then here are the main feature highlights 12.2 with links to additional content from OpenWorld and my data w…

How to intelligently aggregate approximations

The growth of low-cost storage platforms has allowed many companies to actively seeking out new external data sets and combine them with internal historical data that goes back over a very long time frame. Therefore, as both the type of data and the volume of data continue to grow the challenge for many businesses is how to process this every expanding pool of data and at the same time, make timely decisions based on all the available data.
(Image above courtesy of In previous posts I have discussed whether an approximate answer is just plain wrong and whether approximate answers really are the best way to analyze big data. As with the vast majority of data analysis at some point there is going to be a need to aggregate a data set to get a higher level view across various dimensions. When working with results from approximate queries, dealing with aggregations can get a little complicated because it is not possible to “reuse” an aggregated approximate result as…