Cloud Computing, Big Data, Linked Data, Open Data, more
I first came across Infochimps as they set about building one of the early Data Market offerings. I recorded a couple of podcasts with CTO and co-founder Flip Kromer over the years, in 2009 and 2012, tracking some of the ways in which the company and the market were evolving. Since then the company has moved in a different direction, focusing far more attention upon the tools and services required to work with data, and far less upon offering a place for customers to find data. Infochimps was acquired by CSC last year, and folded into the larger company's big data business unit with the aim of accelerating its push towards far greater market visibility and penetration. In this podcast, I talk with Infochimps CEO Jim Kaskade. We quickly cover some of the history, before looking in more detail at the ways in which Infochimps/CSC can differentiate itself in an increasingly crowded market. A degree of technology agnosticism clearly helps here, and Jim talks about the way in which his team will help customers deploy any Hadoop distribution rather than being tied (as Cloudera or Hortonworks would be) to their own home-grown offerings. We also discuss recent investment in the space, with Jim noting that […]
The Data Platform Group at Microsoft does a lot, from SQL Server and their Hadoopey HDInsight offering through to Business Intelligence and analytics capabilities which sit in or on top of the humble Excel spreadsheet. I've touched upon pieces of this whole before, in a 2009 podcast on Azure with Amitabh Srivastava (then Corporate VP with responsibility for Azure), a 2012 podcast with Piyush Lumba (Director of Product Management for Azure Data Services), and recent short pieces on PowerBI. In this latest podcast, I talk with Quentin Clark to get a view on how the various pieces are starting to fit together. Quentin is Corporate VP with responsibility for the Data Platform Group, and our conversation quickly shows that there's a lot going on there. In just over half an hour we barely scratch the surface, but some of the opportunities — and some of the challenges — certainly become apparent. Maybe we can revisit some of the specific areas of opportunity in future conversations… Image, 'York Station approaches', from All About Railways by F.S. Hartnell. Now in the public domain, and shared on Wikimedia Commons.
Related articles:
The cloud first SQL Server 2014 coming in April with in-memory and cloud capabilities […]
With all the noise around newer technologies such as Hadoop, it would be easy to assume that the data analytics space is new — and totally dominated by the NewSQL/NoSQL tools pouring out of the world's startups. It would be easy. But it would also be wrong. Data analytics, business intelligence, and related ideas are not new. Companies such as Teradata have been selling enterprise tools in this space for 30 years or more, and they are proving resilient as the market shifts and evolves around them. In this podcast I talk with Scott Gnau, President of Teradata Labs. We discuss Teradata's perspective on the evolving data analytics landscape, and Scott talks about some of the ways in which his company is adapting. Image from iStock.
Related articles:
Teradata Sharpens Focus on In-database Analytics (datacenterknowledge.com)
"We cannot be out innovated." Teradata President Discusses Benefits of Partnerships #HadoopSummit (siliconangle.com)
Netflix Picks Teradata Cloud for Analytics (datacenterknowledge.com)
It's sometimes easy to assume that the large clusters of commodity servers commonly associated with open source big data and NoSQL approaches like Hadoop have made supercomputers and eye-wateringly expensive high performance computing (HPC) installations a thing of the past. But Adaptive Computing CEO Robert Clyde argues that the world of HPC has evolved, and that the machines in HPC labs now look an awful lot more like regular computers than they used to. They use the same x86-based chipsets, and they run the same (often Linux) operating systems. Furthermore, Clyde argues that techniques and ideas developed in the world's elite HPC facilities have much to offer those running today's enterprise data centres and grappling with the new challenges posed by large volumes of data. In this podcast, Clyde discusses the lessons that HPC experience can bring to a new generation of big data problems, before going on to outline today's software releases from Adaptive Computing. Image of a CRAY-XMP48 supercomputer at the EPFL (Lausanne, Switzerland) shared on Wikimedia Commons under a Creative Commons licence. Original image by 'Rama,' cleaned by 'Dake.'
Related articles:
The green supercomputer: Adaptive Computing is ensuring fast doesn't mean wasteful (venturebeat.com)
Video: How […]
As data volumes increase and marketing channels proliferate, corporate sales teams struggle to efficiently identify the real prospects worth an investment of their time. At Infer, a team experienced in applying predictive analytics to the web-scale problems faced by companies like Google and Yahoo! thinks it has a solution. By combining data from internal customer relationship systems with crawls of the public web, Infer offers its customers insight into which prospects are most likely to buy. Infer CEO Vik Singh joins me for this podcast, to explore the mysterious world of pre-sales and sales, and to illustrate some of the ways in which a data-driven approach can deliver value; a toy sketch of that general approach appears after the related links below.
Related articles:
Infer Raises $10 Million; Helps Companies Use Data to Win More Customers (diversity.net.nz)
Software Predicts Which Companies Are an Easy Sell (technologyreview.com)
Infer's take on big data for lead generation is apparently all the rage (gigaom.com)
A Win for Predictive Analytics: Infer Doubles Customer Bookings Since April (siliconangle.com)
The VP of Sales – Does he Think or Know? (fliptop.com)
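By way of illustration, and purely as a hypothetical sketch rather than anything Infer has described of its own system, the snippet below shows the general shape of a predictive lead-scoring model: a handful of CRM-style fields are combined with signals that might be derived from crawling the public web, and a simple classifier estimates how likely each prospect is to buy. All feature names and numbers are invented.

```python
# Toy sketch (not Infer's actual model): score leads by combining CRM-style
# fields with hypothetical signals derived from the public web.
from sklearn.linear_model import LogisticRegression

# Each row: [employee_count, past_interactions, has_job_postings, web_traffic_score]
# The first two columns stand in for internal CRM data, the last two for crawled web signals.
X_train = [
    [50, 2, 1, 0.8],
    [5, 0, 0, 0.1],
    [200, 5, 1, 0.9],
    [12, 1, 0, 0.3],
    [80, 0, 1, 0.6],
    [3, 0, 0, 0.05],
]
y_train = [1, 0, 1, 0, 1, 0]  # 1 = the prospect eventually bought

model = LogisticRegression().fit(X_train, y_train)

# Score a new prospect so the sales team can rank it against the rest of the pipeline.
new_lead = [[120, 3, 1, 0.7]]
print("Probability of buying:", model.predict_proba(new_lead)[0][1])
```

In practice the feature set, the training data and the choice of model would all be far richer, but the ranking step at the end is the part that matters to a sales team triaging a long list of leads.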
Big Data is undeniably hot right now, and to many Hadoop is inextricably linked to the broader Big Data conversation. And yet, Hadoop has a reputation for being complex, and unpolished, and difficult, and 'technical,' and a host of other less-than-glowing attributes which might cause potential users to pause and take stock. Some of that reputation is, perhaps, undeserved, and many of those limitations are actively being addressed within the Apache Software Foundation's open source Hadoop projects. But there is clearly an opportunity for intermediaries who understand Hadoop, who can make it perform, and who can actively contribute back to those Apache projects. MapR Technologies is one of the better known of those intermediaries (alongside others such as Cloudera, which we also discuss), and the company has done much to encourage adoption and real use of Hadoop beyond the Silicon Valley bubble in which it emerged. In this podcast I talk with MapR Technologies CEO and co-founder, John Schroeder, to learn a little more about his company's approach and to gain his insight into the ways in which Big Data technologies such as Hadoop are being deployed at scale to address real business challenges. Image of the engraving Slag bij Zama tussen Scipio […]
That global accountancy giant KPMG should be interested in data is not, perhaps, surprising. That KPMG would use its own money to "invest in, partner with and acquire organisations that specialise in data and analytics tools and assets" was less immediately obvious to me. But that, according to KPMG global lead for data and analytics Mark Toon, is exactly what KPMG Capital was set up to do last year. Mark is CEO of the new investment fund, and joined me this week for a podcast to discuss some of the data-based issues currently facing KPMG and its clients. The new firm commissioned a survey last summer, interviewing 144 CFOs and CIOs from big multinationals (with revenues above $1 billion). This survey (PDF) was published last month, and offers a useful starting point for many of the topics Mark and I explore. Image by Flickr user Ken Teegardin of www.seniorliving.org.
Related articles:
KPMG launches data and analytics investment fund with strong roots in Houston (bizjournals.com)
KPMG Capital Study Reveals 96 Percent of Businesses Say They Are Not Managing Data Effectively (virtual-strategy.com)
KPMG Launches Big Data Analytics Investment Arm To Gain Rapid Market Entry (forbes.com)
Can't KPMG Just Do Better Audits? – Bloomberg (bloomberg.com)
Portland-based Orchestrate (orchestrate.io) rolls out its commercial NoSQL offering today, claiming to significantly decrease the time, cost and complexity of putting cloud-based data to work. I took the chance to speak with co-founder and CEO (and former Basho co-founder) Antony Falco, to learn more about the company and the problems it’s seeking to address. Our chat ended up becoming quite a wide-ranging discussion of the world of databases, and it’s embedded here as a podcast. The Orchestrate team suggests that a ‘typical’ web application today can use as many as five different databases to store and process diverse data types, or to interact with multiple sensors and other forms of data input. Orchestrate sets out to simplify that complexity, by layering a common and developer-friendly RESTful API on top of the underlying database technologies. Antony discusses some of the ways in which Orchestrate seeks to reduce complexity without abstracting away the power of the underlying tools. Orchestrate’s solution currently runs in Amazon’s US cloud infrastructure, with other clouds and other geographies being actively explored. Antony notes during our call that the company has seen a great deal of interest from beyond the United States, particularly from Europe.
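To make the idea of "one API over many databases" a little more concrete, here is a purely hypothetical sketch of what such a unified RESTful interface might look like from a developer's point of view. The endpoint, paths and credentials below are invented for illustration, and are not Orchestrate's actual API.

```python
# Hypothetical sketch only: a single, developer-friendly RESTful interface over
# several storage types. Endpoint, paths and auth scheme are placeholders.
import requests

BASE = "https://api.example-dataservice.com/v0"   # placeholder endpoint
AUTH = ("my-api-key", "")                         # placeholder credentials

# Store a JSON document under a collection and key, much like a key-value put...
requests.put(f"{BASE}/users/alice",
             json={"name": "Alice", "plan": "pro"},
             auth=AUTH)

# ...then query the same data through a search-style call, without the application
# needing to know which underlying database actually serves each request.
resp = requests.get(f"{BASE}/users", params={"query": "plan:pro"}, auth=AUTH)
print(resp.json())
```

The appeal, as Antony describes it, is that the application talks to one consistent interface while the service routes each call to an appropriate underlying store.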
I used to podcast pretty regularly, on this site and elsewhere. Then other things got in the way and, before I knew it, almost two years had passed since my last podcast here. Well, it's time to put that right. I'm podcasting again, and I've got a nice pipeline of guests lined up over the next few weeks. First up, and breaking me in gently, is another lapsed podcaster: Marc Farley. Marc is a Senior Product Marketing Manager at StorSimple, a storage company acquired by Microsoft back in 2012. He's also pretty active on Twitter, and probably well known to those of you in the storage world. We talk about Microsoft ("software company") and its interest in StorSimple ("hardware company"), before moving on to look at the broader implications of storing stuff in the cloud for a long, long time. We touch on some of the issues raised in a recent post of mine, looking at the value of cloud storage to the archival community. Have a listen, and let us know what you think… And do let me know who else you'd like to hear from, and on which topics. Microphone image by Flickr user John Schneider. Related articles […]
The Open Knowledge Foundation (OKFN) promotes the creation, dissemination and use of 'open knowledge.' As part of this activity OKFN developed a data repository platform called CKAN, and has seen it become increasingly important to a range of data dissemination activities such as data.gov.uk and publicdata.eu. In this podcast I talk with OKFN Director Rufus Pollock and CKAN Product Owner Irina Bolychevsky, to learn more about CKAN, its use in the context of open data, and the wider implications for the dissemination of any data (whether open or closed). Following up on a blog post that I wrote at the start of 2012, this is the tenth in an ongoing series of podcasts with key stakeholders in the emerging category of Data Markets.
Related articles:
Open Knowledge Releases Open Data Handbook 1.0 (readwriteweb.com)
Data Market Chat: Leigh Dodds discusses Kasabi (cloudofdata.com)
Data for the public good (radar.oreilly.com)
Data Market Chat: Stephen O'Grady of RedMonk examines the bigger picture (cloudofdata.com)
Data Market Chat: Nick Edouard discusses BuzzData (cloudofdata.com)
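For readers who want to poke at CKAN directly, the short sketch below queries a CKAN catalogue through its standard Action API. The data.gov.uk base URL and the search term are illustrative; any CKAN instance exposing the v3 Action API should answer the same call, though exact paths and availability may vary between deployments.

```python
# A minimal sketch of pulling dataset metadata from a CKAN catalogue via its Action API.
# Assumes the instance exposes the standard CKAN v3 Action API; the base URL and query
# below are illustrative and may differ from the live data.gov.uk service.
import requests

CKAN_BASE = "https://data.gov.uk/api/3/action"

resp = requests.get(f"{CKAN_BASE}/package_search",
                    params={"q": "air quality", "rows": 5})
resp.raise_for_status()
result = resp.json()["result"]

print(f"Matching datasets: {result['count']}")
for dataset in result["results"]:
    print("-", dataset.get("title"))
```

The same Action API underpins both the CKAN web interface and third-party tools, which is a large part of why catalogues built on it have become useful beyond their own websites.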