BIWA 2016 - here's my list of must-attend sessions and labs
It’s almost here - the 2016 BIWA conference at the Oracle Conference Center. The conference starts on January 26 with a welcome by the conference leaders at 8:30am. The BIWA Summit is dedicated to providing all the very latest information and best practices for data warehousing, big data, spatial analytics and BI. This year the conference has expanded to include the most important query language on the planet: SQL. There will be a whole track dedicated to YesSQL! The full agenda is available here.
Unfortunately I won’t be able to attend this year’s conference, but if I were going to be there, then this would be my list of must-attend sessions and hands-on labs.
Tuesday January 26
What’s New with Spatial and Graph? Technologies to Better Understand Complex Relationships
Time: 10:15 AM - 11:05 AM
Spatial and graph analysis is about understanding relationships. As applications and infrastructure evolve, and as new technologies and platforms emerge, we find new ways to incorporate and exploit social and location information in business and analytic workflows, creating a new class of applications. The emergence of the Internet of Things, Cloud services, mobile tracking, social media, and real-time systems creates a new landscape for analytic applications and operational systems. It changes the very nature of what we expect from devices and systems. New platforms and capabilities combine to become business as usual.
This session will discuss advances, trends, and directions in the technology, infrastructure, data, and applications affecting the use of spatial and graph analysis in current and emerging analytic and operational systems, and the new offerings Oracle has introduced to address these developments. This includes new offerings for the Cloud and the NoSQL and Hadoop platforms, as well as a high-level overview of what is coming in Oracle Spatial and Graph 12.2.
Making SQL Great Again (SQL is Huuuuuuuuuuuuuuuge!)
Time: 11:20 AM - 12:10 PM
SQL is not giving an inch of ground in the fight against NoSQL. Oracle Corporation is fiercely battling on many fronts to reclaim lost ground. Andy Mendelsohn (Executive Vice President for Database Server Technologies), George Lumpkin (Vice President, Product Management), Bryn Llewellyn (Distinguished Product Manager), Steven Feuerstein (Developer Advocate), and Mohamed Zait (Architect) will explain Oracle’s strategy. Oracle ACE Director Kyle Hailey will moderate the discussion.
Is Oracle SQL the best language for Statistics?
Time: 11:20 AM - 12:10 PM
Did you know that Oracle Database comes with more than 280 statistical functions, and that these functions are available in every version of the database? Most people do not seem to know this. When we hear about people performing statistical analysis, they are usually talking about Excel and R. But what if we could do that statistical analysis in the database, without having to extract any data onto client machines? That would be really useful - and just think of the possible data security issues with using Excel, R and other tools.
This presentation will explore the various statistical areas available in SQL, and I will give a number of demonstrations of some of the more commonly used statistical techniques. With the introduction of Big Data SQL we can now use these same 280+ Oracle statistical functions on all our data, including Hadoop and NoSQL. We can also greatly expand our statistical capabilities with Oracle R Enterprise, using its capabilities embedded in SQL.
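To give a flavor of what this looks like in practice, here is a small sketch using a few of the built-in functions; the sales table and its columns are hypothetical, used purely for illustration:

```sql
-- Basic descriptive statistics using functions built into every
-- edition of Oracle Database (hypothetical SALES table).
SELECT AVG(amount)            AS mean_amount,
       MEDIAN(amount)         AS median_amount,
       STDDEV(amount)         AS stddev_amount,
       CORR(amount, discount) AS amount_discount_corr
FROM   sales;

-- Hypothesis testing in plain SQL: a two-sided t-test comparing the
-- mean sale amount between two regions (returns the significance
-- value by default).
SELECT STATS_T_TEST_INDEP(region, amount) AS t_test_sig
FROM   sales
WHERE  region IN ('EAST', 'WEST');
```

Everything runs inside the database, so no data ever needs to be extracted to a client machine.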
Getting to grips with SQL Pattern Matching
Time: 01:20 PM - 02:10 PM
The growth in interest in big data means that the need to quickly and efficiently analyze very large data sets is now a key part of many projects. This session will explore the new Database 12c SQL pattern matching feature. This is a very powerful feature, which can be used to solve a wide variety of business and technical problems. Using a live demo we will compare and contrast pre-12c and 12c solutions for typical pattern matching business problems and consider some technical problems that can be solved very efficiently using MATCH_RECOGNIZE.
To help you understand the basic mechanics of the MATCH_RECOGNIZE clause the demo will look at the new MATCH_RECOGNIZE related keywords in the explain plan such as DETERMINISTIC, FINITE and AUTO. We will explore the key concepts of deterministic vs. non-deterministic state machines. Finally, we will review typical issues such as back-tracking.
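As a taster for the session, here is the classic "V-shape" example from the Oracle documentation: finding a price decline followed by a rise within each stock symbol. It assumes a ticker table with symbol, tstamp and price columns:

```sql
SELECT *
FROM   ticker
MATCH_RECOGNIZE (
  PARTITION BY symbol            -- analyze each stock independently
  ORDER BY tstamp                -- evaluate the pattern in time order
  MEASURES STRT.tstamp       AS start_tstamp,
           LAST(DOWN.tstamp) AS bottom_tstamp,
           LAST(UP.tstamp)   AS end_tstamp
  ONE ROW PER MATCH
  AFTER MATCH SKIP TO LAST UP
  PATTERN (STRT DOWN+ UP+)       -- a fall followed by a rise: a "V"
  DEFINE
    DOWN AS DOWN.price < PREV(DOWN.price),
    UP   AS UP.price   > PREV(UP.price)
) MR
ORDER BY MR.symbol, MR.start_tstamp;
```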
How to choose between Hadoop, NoSQL or Oracle Database
Time: 01:20 PM - 02:10 PM
With today’s hype it is impossible to quickly see which tool is right for the job. Should you use Hadoop for all your data? How do you ensure you can scale out a reader farm? When do you use Oracle Database?
This session clears up (most of) the confusion. We will discuss simple criteria – performance, security and cost – to quickly get to the bottom of which technology you should use when. Note that this is not a product session trying to sell a technology; the session gets to the core of each technology to explain why some types of workload cannot be solved with certain technologies.
Why come to this session:
- Get equipped to cut through the marketing and hype and understand the fundamentals of big data technologies
- Ensure that you do not fall into the trap of wanting the next shiny thing without knowing exactly what you are getting
- Derive an architecture to solve real big data business cases in your organization with the right technologies
Tuesday's Hands-on Labs
If you want to get some up close and personal time with Oracle Advanced Analytics then make sure you attend Brendan Tierney and Charlie Berger's two-hour hands-on lab. By the time you have finished the lab you will be a data mining expert!
Learn Predictive Analytics in 2 hours!! Oracle Data Miner 4.0 Hands on Lab
Time: 01:20 PM - 03:30 PM
Learn predictive analytics in 2 hours!! Multiple experts in Oracle Advanced Analytics/Oracle Data Mining will be on hand to help guide you through the basics of predictive analytics using Oracle Advanced Analytics 12c, a Database 12c Option and SQL Developer 4.1's Oracle Data Miner workflow GUI.
Oracle Advanced Analytics embeds powerful data mining algorithms in the SQL kernel of the Oracle Database for problems such as predicting customer behavior, anticipating churn, identifying up-sell and cross-sell opportunities, detecting anomalies and potential fraud, customer profiling, text mining and retail market basket analysis. Oracle Data Miner 4.1, an extension to SQL Developer 4.1, enables business analysts to quickly analyze and visualize their data, build, evaluate and apply predictive models and develop predictive analytics methodologies—all while keeping the data inside Oracle Database.
Oracle Data Miner's easy-to-use "drag and drop" workflow user interface enables data analysts to quickly build and evaluate predictive analytics models, apply them to new data in Oracle Database and then, most importantly, immediately generate SQL scripts to accelerate enterprise deployment. Come see how easily you can discover big insights from your Oracle data, generate SQL scripts for deployment and automation and deploy results into Oracle Business Intelligence (OBIEE) and other BI dashboards and applications.
Time to relax: at this point you can now relax as day one is almost over. There is just enough time left to enjoy the welcome reception event, sponsored by Deloitte.
Wednesday January 27
The Next Generation of the Oracle Optimizer
Time: 10:05 AM - 10:55 AM
This session introduces the latest enhancements Oracle has made to make the optimizer statistics in Oracle Database more comprehensive, so you can expect the Oracle Optimizer to give you better execution plans than ever before. Learn how Oracle has made it easier to maintain, and be assured of, comprehensive, accurate, and up-to-date statistics. These improvements allow database applications to get the most out of the enhancements made to the Oracle Optimizer. The session uses real-world examples to illustrate these new and enhanced features, helping DBAs and developers understand how best to leverage them in the future and how they can be put to practical use today.
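For context ahead of the session, optimizer statistics are usually gathered automatically, but a manual gather is only a few lines of PL/SQL. This is a minimal sketch; the SALES table name is hypothetical:

```sql
-- Gather optimizer statistics for one table, letting the database
-- pick the sample size and histogram columns automatically.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => USER,
    tabname          => 'SALES',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    method_opt       => 'FOR ALL COLUMNS SIZE AUTO');
END;
/
```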
Taking Full Advantage of the PL/SQL Compiler
Time: 01:00 PM - 01:50 PM
The Oracle PL/SQL compiler automatically optimizes your code to run faster; makes available compile-time warnings to help you improve the quality of your code; and offers conditional compilation, which allows you to tell the compiler which code to include in compilation and which to ignore. This session takes a look at what the optimizer does for/to your code, demonstrates compile-time warnings and offers recommendations on how to leverage this feature, and offers examples of conditional compilation so you know what is possible with that feature.
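For a flavor of the two features mentioned above, here is a minimal sketch; the procedure and its logic are hypothetical placeholders:

```sql
-- Switch on all compile-time warnings for this session.
ALTER SESSION SET plsql_warnings = 'ENABLE:ALL';

-- Conditional compilation: tell the compiler which code to include
-- depending on the database version the unit is compiled against.
CREATE OR REPLACE PROCEDURE demo_proc IS
BEGIN
  $IF DBMS_DB_VERSION.VER_LE_11 $THEN
    NULL;  -- pre-12c fallback code would go here
  $ELSE
    NULL;  -- 12c-only code would go here
  $END
END demo_proc;
/
```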
Graph Databases: A Social Network Analysis Use Case
Time: 02:20 PM - 03:10 PM
Graph databases offer a scalable and high performance platform to model, explore, analyze, and link data for a wide range of applications. The first half of this session will provide a broad introduction to graph databases and how they are used to drive social network analysis, IoT, and linked data applications. The various graph technologies available from Oracle for Hadoop, NoSQL, and Database 12c will also be introduced. The second half will demonstrate the use of the Oracle Big Data Spatial and Graph product to perform “sentiment analysis” and “influencer detection” across a social network. In addition, techniques for combining social analysis with location analysis will be demonstrated to better understand how members of a social network are geographically organized and influenced. Finally, session participants will learn some of the advantages and best practices for undertaking social network analysis on a Big Data graph platform.
Fiserv Case Study: Using Oracle Advanced Analytics for Fraud Detection in Online Payments
Time: 03:25 PM - 04:15 PM
Session level: Intermediate
Keywords: OAA/Oracle Data Mining, OAA/Oracle R Enterprise, Oracle Database, Oracle R Advanced Analytics for Hadoop
Fiserv manages risk for $30B+ in transfers, servicing 2,500+ US financial institutions, including 27 of the top 30 banks, and prevents $200M in fraud losses every year. When dealing with potential fraud, reaction needs to be fast. Fiserv will describe their use of Oracle Advanced Analytics for fraud prevention in online payments and share best practices and results from turning predictive models into actionable intelligence and next-generation strategies for risk mitigation.
Keynote: Oracle Big Data: Strategy and Roadmap
Time: 05:10 PM - 06:00 PM
The main keynote at this year's conference will be presented by Neil Mendelson, Vice President Big Data Product Management, Oracle.
Implement storage tiering in Data warehouse with Oracle Automatic Data Optimization
Time: 04:30 PM - 05:00 PM
The rapid growth in the amount of data in data warehouses poses ever-increasing challenges for performance as well as cost. High-performance storage such as flash SSDs can significantly improve data warehouse performance, but it is very costly to store a very large volume of data on SSDs. One effective way to address these challenges is to store the data in different tiers of storage, and to compress it, based on business and performance needs. For this purpose, Oracle 12c introduced two new features, Automatic Data Optimization (ADO) and Heat Map, to implement storage tiering and data compression based on data usage and predefined rules. This session shows how data warehouse DBAs can leverage these features to optimize database storage for both performance and cost. A few examples will be shown as real-life case studies.
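As a sketch of what this looks like, enabling Heat Map and adding ADO policies takes only a few statements; the table and tablespace names here are hypothetical:

```sql
-- Track segment- and block-level access so ADO policies can fire.
ALTER SYSTEM SET heat_map = ON;

-- Compress segments that have not been modified for 30 days.
ALTER TABLE sales ILM ADD POLICY
  ROW STORE COMPRESS ADVANCED
  SEGMENT AFTER 30 DAYS OF NO MODIFICATION;

-- Move colder segments to a cheaper storage tier.
ALTER TABLE sales ILM ADD POLICY
  TIER TO low_cost_tbs;
```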
Wednesday's Hands-on Labs
There is a whole series of must-attend hands-on labs today, so make sure you find time to learn about all the latest and greatest features in Oracle Advanced Analytics and Oracle Spatial:
Scaling R to New Heights with Oracle Database
Time: 09:00 AM - 10:55 AM
Oracle R Enterprise (ORE), a component of the Oracle Advanced Analytics Option, provides a comprehensive, database-centric environment for end-to-end analytical processes in R, with immediate deployment to production environments. Entire R scripts can be operationalized in production applications - eliminating the need to port R code. Using ORE with Oracle Database enables R users to transparently analyze and manipulate data in Oracle Database thereby eliminating memory constraints imposed by client R engines. In this hands-on lab, attendees will work with the latest Oracle R Enterprise software, getting experience with the ORE transparency layer, embedded R execution, and the statistics engine. Working with the instructors through scripted scenarios, users will understand how R is used in combination with Oracle Database to analyze large volume data.
Predictive Analytics using SQL and PL/SQL
Time: 11:10 AM - 01:50 PM
This hands-on lab is a suitable follow-on for those who attended the 'Learn Predictive Analytics in 2 hours!! Oracle Data Miner 4.0 Hands on Lab'.
As you develop predictive modeling skills using Oracle Data Miner, at some stage you will want to explore the in-database data mining algorithms using the native SQL and PL/SQL commands. In this 2-hour hands-on lab you will build upon the Oracle Data Miner hands-on lab by exploring the existing Oracle Data Mining models, model settings and parameters. You will be walked through the SQL and PL/SQL code needed to set up and build an Oracle Data Mining model and to evaluate the performance of the models. In the final section of this hands-on lab you will see how you can use these newly created models in batch and real-time modes, and how easy it is to build predictive analytics into your front-end applications and analytic dashboards.
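As a preview of the kind of code covered in the lab, building and scoring an in-database classification model looks roughly like this; the table, column and model names are hypothetical, and the settings table would be created beforehand:

```sql
-- Build a classification model inside the database; algorithm
-- choices live in the (pre-created) CHURN_SETTINGS table.
BEGIN
  DBMS_DATA_MINING.CREATE_MODEL(
    model_name          => 'CHURN_MODEL',
    mining_function     => DBMS_DATA_MINING.CLASSIFICATION,
    data_table_name     => 'CUSTOMERS',
    case_id_column_name => 'CUST_ID',
    target_column_name  => 'CHURNED',
    settings_table_name => 'CHURN_SETTINGS');
END;
/

-- Score new data in real time with plain SQL.
SELECT cust_id,
       PREDICTION(CHURN_MODEL USING *)             AS predicted_churn,
       PREDICTION_PROBABILITY(CHURN_MODEL USING *) AS churn_probability
FROM   new_customers;
```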
Applying Spatial Analysis To Big Data
Time: 02:20 PM - 04:15 PM
Location analysis and map visualization are powerful tools to apply to data sources like social media feeds and sensor data, to uncover relationships and valuable insights from big data. Oracle Big Data Spatial and Graph offers a set of analytic services and data models that support Big Data workloads on Apache Hadoop. A geo-enrichment service provides location identifiers to big data, enabling data harmonization based on location. A range of geographic and location analysis functions can be used for categorizing and filtering data. For traditional geo-processing workloads, the ability to perform large-scale operations for cleansing data and preparing imagery, sensor data, and raw input data is provided.
In this lab, you will learn how to use and apply these services to your big data workloads. We will show you how to
- Load common log data, such as Twitter feeds, to HDFS, and perform spatial analysis on that data.
- Develop a RecordReader for this sample data so that spatial analysis functions can be applied to records in the data.
- Create and use a spatial index for such data.
- Use RecordReaders for additional data formats like Shape and JSON.
- Use the HTML5 mapping API to create interactive visualization applications on HDFS data.
Learn how developers and data scientists can manage their most challenging graph, spatial, and raster data processing in a single enterprise-class Big Data platform.
Thursday January 28
Best Practices for Getting Started With Oracle Database In-Memory
Time: 09:50 AM - 10:40 AM
Oracle Database In-Memory is one of the most fascinating new database features in a long time, promising an incredible performance boost for analytics. This presentation provides a step-by-step guide on when and where you should take advantage of Database In-Memory, as well as outlining strategies to help ensure you get the promised performance boost regardless of your database environment. These strategies include how to identify which objects to populate into memory, how to identify which indexes to drop, and details on how to integrate with other performance-enhancing features. With easy-to-follow real-world examples, this presentation explains exactly what steps you need to take to get up and running on Oracle Database In-Memory.
Large Scale Machine Learning with Big Data SQL, Hadoop and Spark
Time: 10:55 AM - 11:45 AM
Many data scientists and data analysts trying to work at scale with statistical analysis tools have to either learn advanced parallel programming or resort to different interfaces to work with data in an Oracle Database and a Big Data cluster.
With the release of the latest Oracle R Advanced Analytics for Hadoop, users have access to the scalability of Big Data into billions of records combined with the fast Machine Learning speeds available through Spark and the Oracle algorithms, without having to leave the intuitive R environment.
Also, with Big Data SQL, the Oracle Advanced Analytics algorithms that run in-Database are able to transparently reach out to data in a Big Data Cluster, exposing the very intuitive Data Miner GUI and the Oracle R Enterprise interfaces to a Data Scientist or Analyst connected to Oracle Database.
In this session, we are going to review two use cases and benchmarks: 1) Predicting Airline flight cancellations using Logistic Regression directly against a Big Data Cluster and 2) Root Cause Analysis of semiconductor manufacturing that uses custom-built R algorithms running against data in the Database and against Hadoop to verify the potential of Big Data SQL.
Analytical SQL in the Era of Big Data
Time: 01:30 PM - 02:20 PM
This session is aimed at developers and DBAs who are working on the next generation of data warehouse ecosystems. Working in these new big data-based ecosystems can be very challenging. What if you could quickly and easily apply sophisticated analytical functions to any data—whether it’s in Hadoop/NoSQL sources or an Oracle Database instance—using a simple, declarative language you already know and love? Welcome back to SQL! This presentation discusses the next generation of SQL with the latest release of Oracle Database and how you can take advantage of its analytical power for any kind of data source, inside or outside of the database. Join this session and be amazed at how much SQL can really do to help you deliver richer big data analytics.
Extreme Data Warehouse Performance with Oracle Exadata
Time: 02:30 PM - 03:20 PM
You've implemented Exadata and are seeing a 3X performance improvement, but you were hoping for more. You've heard the mind-blowing 10X, and sometimes even 100X, improvement stories. This session shows you how to get maximum performance for your data warehouse by having the Exadata "secret sauce" work for you. With a little understanding you can get that 10X+ improvement in your environment. This session will show how to take advantage of the key Exadata specific performance features like smart scans, smart flash cache, storage indexes, HCC and IORM - as well as how to make better use of existing Oracle features like parallelism, partitioning and indexing to make sure you are getting the most out of your Exadata investment.
Attendees of this session will learn why and how their data warehouse can rival the top performing data warehouses in the world using Oracle Exadata and why there is no platform to run an Oracle database faster than Exadata. The session dives into real-world implementation examples of tuning Data Warehouse environments on Exadata at several different customers. The session doesn't just show and explain the Exadata features, but explains how to implement them. It also shows how many other Oracle data warehouse performance features, which are not Exadata specific, should be utilized and implemented on Exadata.
Worst Practice in Data Warehouse Design
Time: 03:40 PM - 04:30 PM
After many years of designing data warehouses and consulting on data warehouse architectures, I have seen a lot of bad design choices by supposedly experienced professionals. A sense of professionalism, confidentiality agreements, and some sense of common decency have prevented me from calling people out on some of this. No more! In this session I will walk you through a typical bad design like many I have seen. I will show you what I see when I reverse engineer a supposedly complete design, walk through what is wrong with it, and discuss options to correct it. This will be a test of your knowledge of data warehouse best practices: can you recognize these worst practices?
Thursday's Hands-on Labs
A chance to get up close and personal with Oracle Database In-Memory
Oracle Database In-Memory Option Boot Camp: Everything You Need to Know
Time: 10:55 AM - 12:30 PM
Oracle Database In-Memory introduces a new in-memory–only columnar format and a new set of SQL execution optimizations such as SIMD processing, column elimination, storage indexes, and in-memory aggregation—all of which are designed specifically for the new columnar format. This hands-on lab provides step-by-step guidance on how to get started with Oracle Database In-Memory and how to identify which of the optimizations are being used and how your SQL statements benefit from them. Experience firsthand just how easy it is to start taking advantage of this technology and the incredible performance improvements it has to offer.
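As a taste of just how easy it is, marking a table for the In-Memory column store and checking what has been populated takes two statements; the SALES table name is hypothetical:

```sql
-- Mark the table for population into the In-Memory column store.
ALTER TABLE sales INMEMORY MEMCOMPRESS FOR QUERY LOW PRIORITY HIGH;

-- Check which segments have actually been populated into memory.
SELECT segment_name, inmemory_size, populate_status
FROM   v$im_segments;
```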
...well, that is my list of sessions and labs that you should definitely add to your personal conference agenda. BIWA really is the perfect place to network with your data warehouse and big data peers and with Oracle product management. Have an awesome conference, and I hope to see you at next year’s event.