MJR Analytics Sessions at UKOUG Tech'18, Liverpool ACC, 3rd-5th December 2018
ukoug3.png

Oracle OpenWorld may have San Francisco and the Moscone Center, but you can’t beat Liverpool and the wind whipping in off the Mersey in early December, and so we’re very pleased to be presenting once again at the UK Oracle User Group Technology Conference & Exhibition, running next week at the Liverpool ACC, December 3rd-5th 2018.

Mark Rittman is presenting two sessions on Oracle’s new autonomous analytics and data warehousing platforms:

  • “Planning a Strategy for Autonomous Analytics & Data Warehouse Cloud”, Monday 3rd December 2:30 PM - 3:15 PM in Room 13

    “As Oracle Analytics and Data Warehousing become self-driving and autonomous, the need for a strategy within your BI function becomes all the more important. How you deliver BI content to your users, the skills your developers now need and the most efficient way to manage your cloud estate are all vital components of an autonomous cloud analytics strategy; this session will explain what’s changed, what’s significant and what the implications of that change are.

    By reference to today’s Oracle BI technology and applications, and by explaining how they now run autonomously as services in Oracle Cloud, we’ll cover the key changes and their implications, what this means in terms of budgets and planning your investment, the key technology and process changes, what this means for team resourcing and roles - and, just as importantly, what can safely be ignored for now.”

  • “BI Developer to Data Engineer with Oracle Analytics Cloud Data Lake Edition”, Tuesday 4th December 12:35 PM - 1:20 PM in Room 11A

    “In this session, we'll look at the role of the data engineer in designing, provisioning, and enabling an Oracle Cloud data lake using Oracle Analytics Cloud Data Lake Edition. We’ll also examine the use of data flow and data pipeline authoring tools and how machine learning and AI can be applied to this task. Furthermore, we’ll explore connecting to database and SaaS sources along with sources of external data via Oracle Data-as-a-Service. Finally, we’ll delve into how traditional Oracle Analytics developers can transition their skills into this role and start working as data engineers on Oracle Public Cloud data lake projects.”

Mark Rittman will be at the event all week, so if you’ve got any questions about the topics we’re talking about in the two sessions or any aspect of Oracle Analytics, big data or Autonomous Data Warehouse either drop us an email beforehand, give us a call on +44 7866 568246 or just stop us in the corridor and we’ll be pleased to chat then.

Five Thoughts About Thomas Kurian’s Move to become CEO of Google Cloud Platform

The news broke late on Friday that Thomas Kurian is joining Google to become their new CEO for Google Cloud Platform. Five thoughts on Kurian’s move:

  1. It’s a smart move made by an exceptionally smart guy. Brave, too, given the recent history between Oracle and Google, but also not surprising given his drive and, presumably, a point to prove. I met him a few times as part of Oracle’s ACE Director program and he’s the only software exec I know who can talk long-term strategy and vision one minute and then explain to you in detail how it all works, all with great humility and courtesy.

  2. The fact that GCP is spoken of as an also-ran at 10% market share whilst Oracle Cloud gets bundled in with the “Next 10” shows what a mountain Oracle have to climb to even become a contender against Microsoft and Amazon in the cloud business - and their insistence on only allowing their SaaS and PaaS apps to run in Oracle Cloud is a worrying parallel with the “Windows Tax” that Microsoft’s Office and Server product teams had to pay back in the Steve Ballmer era, but with Oracle’s equivalent of Satya Nadella having lost the argument and jumped ship.

  3. But Oracle will survive, and this has happened many times before - Ray Lane, Charles Phillips, Tom Siebel and Marc Benioff all left, and in many cases went on to found massively successful and competitive businesses; client-server gave way to internet architecture, and the internet gave way to cloud. It’s all part of how businesses renew and people move on and up, there are plenty more smart (and most likely, smarter) people left at Oracle, and Larry Ellison is still just as driven, competitive and on top of his game.

  4. Look out for a very interesting (as in Brexit: interesting to watch but not to be part of) culture clash at GCP, with TK about the most top-down leader of a product development team I’ve ever seen, and Google famously engineering-focused, beanbag-friendly and bottom-up. Add to that the feelings most Googlers have towards Oracle and TK will have his work cut out early on to win the various teams over - my guess is that his work ethic, technical chops and sheer ability will do just that, and if this role is a success, Sergey and Larry may well have found their new Eric Schmidt, but this time with technical skills to match theirs - but there’s always the chance that culture will prevail and he’ll be the next Marissa Mayer instead. No pressure there, then.

  5. Expect to see GCP moving increasingly into SaaS B2B areas such as ERP, CRM, Financials and industry-vertical applications to complement their commoditised IaaS and PaaS cloud business, leveraging their massive GSuite and increasingly ChromeOS install base. Just think what they could do if they had access to all the world’s structured business transactional data as well as the unstructured search, email and spreadsheet data they have now - even more comprehensive and market-leading search, the world’s biggest and most valuable ML training set, and a business model that could provide these applications for free in exchange for exclusive access to that data and Google being your default search engine. That’s the real existential threat to Oracle: spending all their time trying to win an unwinnable cloud infrastructure war, only for GCP to come along and make ERP, CRM and business applications essentially free.

Slides and Forbes.com Article from "Data Warehouse Like a Tech Startup with Oracle Autonomous Data Warehouse Cloud"

One of the sessions we delivered at Oracle OpenWorld 2018 was “Data Warehouse Like a Tech Startup with Oracle Autonomous Data Warehouse Cloud”, a presentation on the agility that Oracle Autonomous Data Warehouse Cloud (ADWC) can provide for teams within big, traditional businesses that would otherwise have to wait months for a physical Oracle data warehousing database to be provisioned. Moreover, ADWC instances can scale up and scale down as demand changes over time and hits short-term peaks such as Black Friday, which in the past would have required companies to pay for and provision that level of capacity all year round even though it sat idle most of the time.

The story behind the presentation was also written up as a profile article in Forbes magazine by Oracle’s Jeff Erickson, with a great description of how MJR Analytics works on client projects.

To understand the type of agility Rittman is talking about, look at how he works with a client on a data analytics question: “You want to help them get inside their data to see trends and patterns” that they couldn’t otherwise see, he says. That might mean quickly bringing in public data or data from a data-as-a-service company and connecting outside analytics tools. “I work fast,” he says. With a retail customer, for example, “we’ll look to understand the lifetime value of a customer, their cycle of purchases, repeat purchases—all kinds of scenarios. And the whole thing will happen in a day. Much of it collaboratively with the client on a web conference.”
— Jeff Erickson, "How A Big Business Can Use An Autonomous Database To Move Like A Startup"

But don’t just take our word for it, or even Forbes magazine’s - our objective is for every client engagement to be referenceable where client confidentiality allows, and you can check out our first two testimonials, from Colourpop and from Florence, on our new Customers page … with a couple more to be added in the next week or so!

Event-Level Digital Analytics using Google Analytics, Fivetran, BigQuery and Looker

A few weeks ago I posted a blog on using Mixpanel, Looker and Google BigQuery to analyze listener data from the Drill to Detail website, with Mixpanel tracking individual episode play events by visitors to the site and Fivetran replicating that event data over to BigQuery for analysis using Looker.

Mixpanel is great, but like most digital businesses using Google Analytics (GA) to analyze and measure activity on their websites, we actually use Google Tag Manager (GTM) and GA events to track visitor activity on the mjr-analytics.com website. Whilst a full site license for Google Analytics 360 is way beyond our budget right now, the free version of GA together with GTM provides us with an unsampled, event-level stream of visitor events and, together with goals and simple A/B testing tools, a way to start optimizing the content on our site.

Couple all of this with Fivetran and Looker and we can start to do some interesting analytics on all this data to not only count sessions, visitors and pages viewed but also start looking at visitor retention and the routes those visitors take through the various features on our website.

Events in Google Analytics are a way of recording the interactions visitors make with items on a page, such as selecting an item on a navigation menu, clicking on a download link or seeing a special offer or product recommendation. On the mjr-analytics.com website we’ve set up events, for example, to record how far down the page a visitor scrolls, so we can see how many people actually scroll past the hero image on our front page and look at the content on Looker services, Oracle services and so on.
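To show what’s happening under the hood, here’s a minimal Javascript sketch of scroll-depth tracking; in practice GTM’s built-in Scroll Depth trigger does this for you, and the threshold values and event field names below are illustrative assumptions rather than what our GTM container actually uses.

```javascript
// Pure helper: given the percentage of the page scrolled so far and the
// thresholds already fired during this page view, return the thresholds
// that should now fire a dataLayer event (each fires at most once).
function newThresholds(percentScrolled, alreadyFired) {
  return [25, 50, 75, 100].filter(function (t) {
    return percentScrolled >= t && alreadyFired.indexOf(t) === -1;
  });
}

// Browser wiring (assumes window/document, e.g. in a GTM custom HTML tag):
//
//   var fired = [];
//   window.addEventListener('scroll', function () {
//     var percent = Math.round(
//       (window.scrollY + window.innerHeight) / document.body.scrollHeight * 100);
//     newThresholds(percent, fired).forEach(function (t) {
//       fired.push(t);
//       window.dataLayer = window.dataLayer || [];
//       window.dataLayer.push({
//         event: 'scrollDepth',
//         eventCategory: 'Scroll Tracking',
//         eventLabel: t + '%'
//       });
//     });
//   });
```

The helper is kept separate from the browser wiring so the firing logic itself is easy to reason about and test.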

Untitled 2.001.png

Another event records which hero image a visitor clicks on, and another which navigation menu item was selected, in each case storing the image name or menu item as the event label.

Untitled 2.002.png

Whilst you can record events by adding custom Javascript to your website’s HTML page definitions, we use GTM’s graphical point-and-click interface to set up our events and the visitor actions that trigger them.

Untitled 2.005.png

Enabling GTM’s debug mode for our site shows us which events trigger when visiting the site, with the screenshot below showing a typical visitor journey through our homepage and the events that fire as the visitor scrolls through and interacts with various items on the page.

Untitled 2.003.png

Data on events being triggered along with event label and event action values are stored in a Javascript data layer that can also receive values sent over, for example, by an eCommerce application so that clicks on a product catalog item can record the price on offer as well as what product was clicked on. 
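To make that concrete, here’s a hedged sketch of the kind of push an eCommerce application might make into the data layer; the event and field names are invented for illustration rather than taken from any standard GA/GTM schema.

```javascript
// In the browser, GTM reads from window.dataLayer; outside a browser
// (e.g. when testing this sketch in Node) we fall back to a plain array.
var dataLayer = typeof window !== 'undefined'
  ? (window.dataLayer = window.dataLayer || [])
  : [];

// Push a product-click event, recording the price on offer as well as
// which product catalog item was clicked on.
function pushProductClick(layer, productName, price) {
  layer.push({
    event: 'productClick',            // illustrative event name
    eventCategory: 'Product Catalog Clicks',
    eventLabel: productName,          // stored as the GA event label
    productPrice: price               // extra value sent by the application
  });
}

pushProductClick(dataLayer, 'Example Product', 49.99);
```

GTM triggers can then fire on the `productClick` event and map the extra fields to GA event dimensions and metrics via data layer variables.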

Looking at the Data Layer view in Google Tag Manager you can see the event entries recorded for this visit to the page; note also that nothing personally identifiable is recorded except for a unique client ID, set at the device/browser level, which can then be used to analyze new vs. returning visitors and track visitor retention and engagement over time.

Untitled 2.004.png

As with the Mixpanel example and the Drill to Detail website, we use Fivetran to replicate all of the data layer events over to Google BigQuery, storing this granular visitor behavioural data in a BigQuery table and dataset. As you can see from the screenshot below, we use Fivetran extensively within MJR Analytics: replicating invoices and all our other accounting data from Xero into BigQuery, along with timesheet data from Harvest, CRM data from Hubspot, commits and PRs from Github, tasks outstanding and completed from Asana, and search rankings and keywords from Google Search Console, all into a Google BigQuery project whose datasets we then combine and analyze together using Looker.

Untitled 2.006.png

For the event data coming across from GA, we use Fivetran to replicate the relevant event tracking dimensions and metrics into BigQuery every five minutes, giving us near real-time numbers on what’s popular and engaging on our company website.

Untitled 2.007.png

As the event data from GA arrives as table rows of individual events, I then sessionize those events into rows of visitor sessions, using BigQuery SQL to pivot the incoming data and create individual columns for, say, the first event, second event, first page view and first product item clicked on within an individual visitor session. The actual SQL used for the view is too long to include here in full, but to give you the idea, here’s a condensed version with repeating parts removed for clarity.

SELECT
  global_session_id,
  visitor_session_id,
  visitor_id,
  visitor_session_start_date_time,
  MAX(visitor_session_length_mins) AS visitor_session_length_mins,
  visitor_first_event_week_cohort,
  city,
  session_events_count,
  ...
  event_1_category,
  event_1_label,
  ...
  page_view_1,
  page_view_2,
  ...
FROM (
  SELECT
    e.global_session_id,
    e.visitor_session_id,
    e.visitor_id,
    e.visitor_session_start_date_time,
    e.visitor_session_length_mins,
    e.visitor_first_event_week_cohort,
    e.city,
    MAX(e.visitor_session_event_seq_num) OVER (PARTITION BY e.global_session_id) AS session_events_count,
    (
    SELECT
      COUNT(*)
    FROM
      `aerial-vehicle-148023.dw_staging.ga_events`
    WHERE
      global_session_id = e.global_session_id
      AND event_category = 'Page View') AS page_view_count,
    (
    SELECT
      COUNT(*)
    FROM
      `aerial-vehicle-148023.dw_staging.ga_events`
    WHERE
      global_session_id = e.global_session_id
      AND event_category = 'Nav Bar Clicks') AS nav_bar_click_count,
    ...,
    e1.event_label AS event_1_label,
    e2.event_category AS event_2_category,
    e2.event_label AS event_2_label,
    ...
    pv.event_label AS page_view_1,
    pv2.event_label AS page_view_2,
    ...,
    igc.event_label AS index_gallery_first_click
  FROM
    `aerial-vehicle-148023.dw_staging.ga_events` e
  LEFT OUTER JOIN (
    SELECT
      global_session_id,
      event_label,
      ROW_NUMBER() OVER (PARTITION BY global_session_id ORDER BY visitor_session_event_seq_num) page_view_seq_num
    FROM
      `aerial-vehicle-148023.dw_staging.ga_events`
    WHERE
      event_category = 'Page View'
    ORDER BY
      3) pv
  ON
    e.global_session_id = pv.global_session_id
    AND pv.page_view_seq_num = 1
  LEFT OUTER JOIN (
    SELECT
      global_session_id,
      event_label,
      ROW_NUMBER() OVER (PARTITION BY global_session_id ORDER BY visitor_session_event_seq_num) page_view_seq_num
    FROM
      `aerial-vehicle-148023.dw_staging.ga_events`
    WHERE
      event_category = 'Page View'
    ORDER BY
      3) pv2
  ON
    e.global_session_id = pv2.global_session_id
    AND pv2.page_view_seq_num = 2
  ...
  JOIN
    `aerial-vehicle-148023.dw_staging.ga_events` e1
  ON
    e.global_session_id = e1.global_session_id
    AND e1.visitor_event_seq_num = 1
  LEFT OUTER JOIN
    `aerial-vehicle-148023.dw_staging.ga_events` e2
  ON
    e.global_session_id = e2.global_session_id
    AND e2.visitor_event_seq_num = 2
  ...
)
GROUP BY
  1,
  2,
  ...,
  46
ORDER BY
  1

Creating a view or materialized table from this BigQuery SQL then gives me one record per session, with event totals, page view and event paths, and the session and client IDs - perfect for bringing into Looker for further analysis.
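To show the shape of that sessionization outside of SQL, here’s a rough Javascript sketch of the same pivot: it groups event rows by session ID, counts events per session and pulls out the first two page views as columns. The column names follow the ga_events table used in the SQL above, but the function itself and its sample inputs are invented for illustration; it also assumes the rows arrive in event-sequence order within each session.

```javascript
// Pivot a flat array of event rows into one object per session, with an
// event count and the first two page-view labels as separate fields -
// a simplified analogue of the BigQuery view.
function sessionize(events) {
  var sessions = {};
  events.forEach(function (e) {
    var s = sessions[e.global_session_id] ||
      (sessions[e.global_session_id] = {
        global_session_id: e.global_session_id,
        session_events_count: 0,
        page_views: []
      });
    s.session_events_count += 1;
    if (e.event_category === 'Page View') {
      s.page_views.push(e.event_label);
    }
  });
  // Flatten the accumulated page views into fixed columns, NULL-ing
  // the ones that don't exist, just as the SQL LEFT OUTER JOINs do.
  return Object.keys(sessions).map(function (id) {
    var s = sessions[id];
    return {
      global_session_id: s.global_session_id,
      session_events_count: s.session_events_count,
      page_view_1: s.page_views[0] || null,
      page_view_2: s.page_views[1] || null
    };
  });
}
```

The in-database SQL version is preferable for real volumes, of course; this is just the same pivot written imperatively to make the logic easier to follow.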

Untitled 2.008.png

Then finally it’s just a case of bringing that table of data into Looker as a LookML view and using one of Looker’s custom visualization blocks to display visitor event paths as a Sankey diagram, for example.

Untitled 2.009.png

Or we can aggregate and analyze counts of events from sessions to see in graphical form which offers, download items and navigation menu items are most popular over a given period.

Untitled 2.010.png

If you’d like to understand more about how MJR Analytics can help you better understand what visitors are looking at and interacting with on your website, drop me an email at mark.rittman@mjr-analytics.com; or even better, if you’re in Copenhagen on November 22nd and have the evening free, come along to the free meetup we’re hosting together with our partners CIMA and Looker, where I’ll demo all of this live, in person, and show you how it works.

Digital Analytics, BI and Big Data Meetup in Copenhagen 22.11.2018

If you’re in Copenhagen on 22nd November 2018 and interested in digital analytics, big data and what’s new in business intelligence, then you’ll be interested in a free evening meetup we’re hosting with our partners CIMA and Looker:

“Where Digital Analytics is Taking BI and Big Data”

How usage and new vendors in the digital, eCommerce and SaaS Analytics market are revolutionising the way in which data is analysed, democratising big data and enabling businesses to finally become “data-driven”.

  • Introduction by Mogens Norgaard, CIMA

  • Presented by Mark Rittman, CEO, MJR Analytics

Meetup hosted by CIMA Technologies, MJR Analytics and Looker

Date: Thursday 22nd November
Time: 17.00hrs - 19.00hrs
Location: Ingeniørforeningen IDA, Kalvebod Brygge 31-33 · 1780 København V

Registration is free, and there’ll be talks by myself and Looker, with an introduction by none other than Mogens Norgaard. Just don’t let Mogens know it’s my birthday that day too - or at least not until I’ve finished my presentation and demo. Hopefully we’ll see some of you in Copenhagen for the meetup; it’ll be good.
