Mark Rittman is joined in this episode by Greg Michaelson from DataRobot, talking about the benefits of automating the discovery and automation of analytics and machine learning in financial services and other industries.
Mark Rittman is joined by Will Davis from Trifacta to talk about the public beta of Google Cloud Dataprep, Trifacta's data wrangling platform and topics including metadata management, data quality and data management for big data and cloud data sources.
- Google Cloud Dataprep on Google Cloud Platform
- "Google Cloud Dataprep: Spreadsheet-Style Data Wrangling Powered by Google Cloud Dataflow"
- "A New Cloud-Based Data Prep Solution from Google & Trifacta"
- Trifacta website
- "A Breakthrough Approach to Exploring and Preparing Data"
- Trifacta platform architecture
- "Garbage In, Garbage Out: Why Data Quality Matters"
- "How to Put an Effective Metadata Strategy in Place"
Drill to Detail returns after the New Year break with Special Guest Julian Hyde from Hortonworks to talk about bitmap indexes and CASE tools, Mondrian and open-source OLAP analysis, and Apache Calcite's mission to bring sanity, cost-based optimisers and support for OLAP workloads to today's disaggregated, distributed new-world database engines.
- Oracle Designer page on Oracle.com
- Bitmap Index page on Wikipedia
- Mondrian project page on GitHub
- Mondrian OLAP Server page on Wikipedia
- MultiDimensional eXpressions (MDX) page on Wikipedia
- Julian Hyde blog
- Apache Calcite project homepage
- Apache Calcite Introduction and Overview deck
- Streaming SQL presentation at Apex Big Data World 2017, Mountain View, California
Mark Rittman is joined in this episode of Drill to Detail by Dr. Carsten Bange from BARC to talk about findings from the recently completed BI Survey 17 including the continuing move to modern BI platforms and self-service desktop tools, analytics adoption trends and the increasing incorporation of BI functionality within business applications, the surprising topicality of master data management and data governance ... and whatever happened to Nigel Pendse and his legendary OLAP Report?
- The BI Survey 17: The World’s Largest Annual Survey of BI Users
- Master Data and Data Quality Management Now the #1 Trend in BI
- BI Trend Monitor 2018 Infographic: The Evolution of Trends
- The Business Intelligence Industry Continues Its Ongoing Empowerment of Business Users
- The OLAP Report: The origins of today’s OLAP products (c. 2005, from the Internet Archive)
Mark Rittman is joined in this episode by returning special guest Jen Underwood to talk about what's new and innovative in the BI and analytics industry right now, and how AI and machine learning are this year's data discovery and data visualization.
- "Between The Lines At Tableau Conference" - JenUnderwood.com blog
- "Transform The Business With Automated Embedded Artificial Intelligence" - JenUnderwood.com blog
- "Moving From BI To Machine Learning With Automation" - JenUnderwood.com blog
- "How Smart Data Discovery Will Radically Transform Analytics" - Tellius Webinar with Jen Underwood
- Yellowfin BI - homepage
- Paxata - homepage
- "Drill To Detail Ep.8 'Self-Service BI, Data Prep & Big Data Vendor Strategy' With Special Guest Jen Underwood"
Mark Rittman is joined in this episode by Taylor Brown from Fivetran to talk about middleware for SaaS data, their focus on integrations with SaaS vendors and how this differentiates their offering, his thoughts on packaged analytic applications announced at the recent Looker Join conference ... and where the name "Fivetran" came from.
Drill to Detail returns for a new season with special guest Jean-Pierre Dijcks to talk about Oracle's Big Data Strategy now and in the past, thoughts on distributed query and storage in the cloud, and a preview of the themes and announcements to look forward to at the upcoming Oracle OpenWorld 2017 event running in San Francisco next month.
Mark Rittman is joined by Industry Analyst Mark Madsen to talk about marketing analytics and the rise of the omni-channel consumer, the use of AI in analytics and personalization and what this all means for brands, for advertisers and for marketers.
In this episode Mark is joined by Jake Stein to talk about Stitch Data and their ETL tool for data engineers, the new open-source project Singer and his experiences building a software startup that both partners and competes with the big cloud platform vendors.
- Stitch Data
- Singer: Simple, Composable Open-Source ETL
- Setting the Data Strategy for Your Growing Organization
- The State of Data Engineering
- The State of Data Science
- Why our ETL Tool Doesn't Do Transformations
- Airflow: a workflow management platform
- Goodbye RJMetrics, Hello Fishtown Analytics
- Engineers Shouldn’t Write ETL: A Guide to Building a High Functioning Data Science Department
Mark Rittman is joined by Donald Farmer to talk about his work at Microsoft on SQL Server Analysis Services and Integration Services, why he moved to Qlik and the challenges of evolving a BI product strategy from focusing on desktops to focusing on the enterprise, and some advice for customers, software vendors and partners working with data and analytics tools.
Mark is joined by returning special guest Dan McClary to talk about data modeling and database design on distributed query engines such as Google BigQuery, the underlying Dremel technology and columnar storage format that enables this cloud distributed data warehouse-as-a-service platform to scale to petabyte-size tables spanning tens of thousands of servers, and techniques to optimize BigQuery table joins using nested fields, table partitioning and denormalization.
- Dremel: Interactive Analysis of Web-Scale Datasets
- BigQuery under the hood
- Inside Capacitor, BigQuery’s next-generation columnar storage format
- Drill To Detail Ep.2. 'Future Of SQL On Hadoop', With Special Guest Dan McClary
- Google BigQuery, Large Table Joins and How Nested, Repeated Values and the Capacitor Storage Format (and Looker) Saves the Day
Oracle's Jack Berkowitz joins Mark Rittman to talk about a new category of continuously adapting, self-learning applications being built out by Oracle that use machine learning together with enterprise and third-party data to create a new generation of intelligent HR, CX, SCM and ERP SaaS apps.
Stewart Bryson returns to the show to join Mark Rittman to discuss new-world BI and data warehousing development using Google BigQuery and Amazon Athena, Apache Kafka and StreamSets, and talks about his experiences with Looker, the cloud-native BI tool that brings semantic modeling and modern development practices to the world of business intelligence.
Mark Rittman is joined by Gwen Shapira from Confluent to talk about Apache Kafka, streaming data integration and how it differs from batch-based, GUI-developed ETL, the problem with architects, exactly-once processing and how data governance is coming to Kafka development with Confluent's new schema registry server.
Mark Rittman is joined by Kevin Madden and Josh Feingold to talk about graph + spatial analytics, Tom Sawyer Software ... and why a tweet about a certain WiFi kettle incident went viral last October.
- Visualizing When a Tweet Goes Viral
- How a Tweet Went Viral - BIWA Summit 2017
- English man spends 11 hours trying to make cup of tea with Wi-Fi kettle (The Guardian)
- The iKettle, the Eleven-Hour Struggle to Make a Cup of Tea, and Why It Was All About Data, Analytics and Connecting Things Together
- Tom Sawyer Software Perspectives
Mark Rittman is joined by Craig Stewart to talk about application and data integration, ODI and Sunopsis, SnapLogic's approach to hybrid on-premise/cloud integration and the rise of data preparation and dataflow-based cloud integration tools.
Mark Rittman is joined by Independent Consultant Chris Webb to talk about MDX & DAX, MSAS and SQL Server, and the fall ... and rise of Microsoft BI.
Mark Rittman is joined in this episode by MapR's Tugdual Grall to talk about MapR's platform differentiation and relationship with open-source Hadoop, scaling and streaming, microservices, and MapR's platform strategy around big data workloads in the cloud.
Mark Rittman is joined by Elastic's Mark Walkom to talk about Elasticsearch, Kibana, Logstash and the Elastic Stack; business models built around an open-source software core; and their move into cloud services with Elastic Cloud.
Mark Rittman is joined by Vasu Murthy, Oracle's Senior Director for Product Management of Oracle Business Analytics, to talk about what's new with OBIEE and Oracle Data Visualization and the recently released Oracle Analytics Cloud, a dive into the technical architecture of these new additions to Oracle's BI platform, and Oracle's vision for hybrid on-prem/cloud analytics.