Distributed data processing framework
POPULARITY
Ari Zilka is CEO of mydecisive.ai, the general-purpose observability engine built on OpenTelemetry. Ari was previously the CTO of Hortonworks which built products on top of open source Apache Hadoop and merged with Cloudera in 2019. In this episode, we dig into the similar patterns Ari sees between Hortonworks / Hadoop and mydecisive.ai / OpenTelemetry, why large enterprises don't want their data to be held hostage and are shifting towards OpenTelemetry, how open source switches costs from vendors to engineering, why he's focused on building the community before monetizing, his view on monetization ("the developer bakes you in, the operator pays for it"), why the "ops" side of "devops" carries the money, and the learnings from building Hortonworks that he's bringing into mydecisive.ai.
[Episode References] Spinning YARN - A New Linux Malware Campaign Targets Docker, Apache Hadoop, Redis and Confluence - https://www.cadosecurity.com/spinning-yarn-a-new-linux-malware-campaign-targets-docker-apache-hadoop-redis-and-confluence/ Unveiling Earth Kapre aka RedCurl's Cyberespionage Tactics With Trend Micro MDR, Threat Intelligence - https://www.trendmicro.com/en_us/research/24/c/unveiling-earth-kapre-aka-redcurls-cyberespionage-tactics-with-t.html z0Miner Exploits Korean Web Servers to Attack WebLogic Server - https://asec.ahnlab.com/en/62564/ About the security content of iOS 17.4 and iPadOS 17.4 - https://support.apple.com/en-us/HT214081 Script and presentation: Carlos Cabral and Bianca Oliveira. Audio editing: Paulo Arruzzo. Closing narration: Bianca Garcia.
Podcast episode on managing and storing cold data in the enterprise: efficiently managing unstructured bulk data with tape, file, and object storage. About this episode: According to studies such as the current IDC Global DataSphere Report, enterprise data volumes are expected to double over the next 5 years. In enterprise use, 80% or more of this is unstructured and semi-structured data. In practice, only a small single-digit percentage of it is currently used for analysis, even though the value of data as a strategic asset for digital transformation should be the central concern. There is therefore an urgent need for action, because analyzing historical and other rarely accessed data can help companies stay competitive. Technologies such as Apache Hadoop enable cost-efficient big data analytics based on open source technologies, and the choice of a suitable storage technology also helps determine project success. However, for many organizations, especially small and medium-sized ones, budget constraints and a shortage of skilled staff make it difficult to reach these goals with purely internal, on-premises platforms in their own data center. Public cloud storage, on the other hand, brings different challenges around data sovereignty and control. Access and storage fees, which can become expensive when accessing cold data, are also factors that should not be underestimated. In this episode, you will learn why cold data matters for environmental and cost reasons. It also covers how such data can be stored and managed in an energy-efficient way. Further information on the topic, with references to IT technologies and vendor solutions, can be found on our website.
In this episode with the two leading specialists in Brazil on this subject, Thiago Santiago and Gustavo Gattass, we talk about Cloudera's new data platform, which as always brings innovation to the Big Data and Analytics market. Doug Cutting, creator of the famous Apache Hadoop system, made it all possible in 2006 for massive data processing, and now Cloudera's new unified platform, CDP, brings the following major benefits to its users:
- Hybrid cloud
- Cloudera SDX for a unified deployment platform with Kubernetes
- Data engineering and data science as a unified delivery product
- Data warehouse and data visualization
Understand the future of data engineering and data science on a platform whose main goal is to deliver a complete end-to-end solution: get on board with Cloudera CDP.
Thiago Santiago = https://www.linkedin.com/in/thiagosantiago/
Gustavo Gattass = https://www.linkedin.com/in/ggattass/
On YouTube we have a Data Engineering channel covering the most important topics in this field, with live streams every Wednesday. https://www.youtube.com/channel/UCnErAicaumKqIo4sanLo7vQ
If you want to keep up with this field through weekly posts and updates, go to LinkedIn so you don't miss any news. https://www.linkedin.com/in/luanmoreno/
Available on Spotify and Apple Podcasts: https://open.spotify.com/show/5n9mOmAcjra9KbhKYpOMqY https://podcasts.apple.com/br/podcast/engenharia-de-dados-cast/
Luan Moreno = https://www.linkedin.com/in/luanmoreno/
When it comes to Apache Kafka®, there's no one better to tell the story than Jay Kreps (Co-Founder and CEO, Confluent), one of the original creators of Kafka. In this episode, he talks about the evolution of Kafka from in-house infrastructure to a managed cloud service and discusses what's next for infrastructure engineers who used to self-manage the workload. Kafka started out at LinkedIn as a distributed stream processing framework and was core to their central data pipeline. At the time, the challenge was to address scalability for real-time data feeds. The social media platform's initial data system was built on Apache™ Hadoop®, but the team later realized that operationalizing and scaling the system required a considerable amount of work. When they started re-engineering the infrastructure, Jay observed a big gap in data streaming: on one end, data was being looked at constantly for analytics, while on the other end, data was being looked at once a day, missing real-time data interconnection. This ushered in efforts to build a distributed system that connects applications, data systems, and organizations for real-time data. That goal led to the birth of Kafka and eventually a company around it: Confluent.
Over time, Confluent progressed from focusing solely on Kafka as a software product to a more holistic view: Kafka as a complete central nervous system for data, integrating connectors and stream processing with a fully managed cloud service.
Now, as organizations make a similar shift from in-house infrastructure to fully managed services, Jay outlines five guiding points to keep in mind:
- Cloud-native systems abstract away operational efforts for you without infrastructure concerns
- It's important to have a complete ecosystem for Kafka, including connectors, a SQL layer, and data governance
- A distributed system should allow data to be accessible everywhere and across organizations
- Identifying a dependable storage infrastructure layer, such as Amazon S3, is critical
- Cost-effective models mean sustainability and systems that are easy to build around
EPISODE LINKS
- Building Real-Time Data Systems the Hard Way
- Kris Jenkins Twitter
- The Hitchhiker's Guide to the Galaxy
- Hedonic treadmill
- Watch the video version of this podcast
- Join the Confluent Community
- Learn more with Kafka tutorials, resources, and guides at Confluent Developer
- Live demo: Intro to Event-Driven Microservices with Confluent
- Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
This episode marks an exciting new season and new format of The Peter O. Estévez Show. We'll be bringing in a lot of celebrity guests as well as a couple of co-hosts. The guest of this episode is Chris Mattmann. Chris is the IT Chief Technology and Innovation Officer at NASA JPL. His work has helped NASA explore space and helped journalists in government track international financial crime amongst the world's elite across the globe. Chris is a frequent keynote speaker in government, academia, and industry, and his work has helped me find the field of data science. Listen to learn more about Chris' incredible, worldwide impact.
You will want to hear this episode if you are interested in...
- What is Apache Hadoop? [03:20]
- Chris' journey to NASA [07:04]
- Fighting financial crime with data [13:20]
- Chris' role and future at NASA [19:55]
- Advice for the crypto space [24:17]
- What happened to UFOs? [28:37]
- The future of automation [30:58]
- NASA's response to the pandemic [34:18]
Resources & People Mentioned
- Apache Hadoop
- NASA.gov
- The Panama Papers: What You Should Know
- Flippy – Miso Robotics
- Machine Learning with TensorFlow, Second Edition
Connect with Chris Mattmann
- Their website
- On Instagram
- On Twitter
- On LinkedIn
Connect With Peter O. Estévez
- www.peteroestevezshow.com
- Follow on Facebook
- Follow Coming Clean on Instagram
- Follow Peter on Instagram
Subscribe to Coming Clean on Apple Podcasts, Spotify, Google Podcasts
Audio Production and Show notes by PODCAST FAST TRACK https://www.podcastfasttrack.com
Please join us in The BreakLine Arena for a conversation with Ali Ghodsi, CEO and Co-founder of Databricks. As Chief Executive Officer, Ali is responsible for the growth and international expansion of the company. He previously served as the VP of Engineering and Product Management before taking the role of CEO in January 2016. In addition to his work at Databricks, Ali serves as an adjunct professor at UC Berkeley and is on the board at UC Berkeley's RiseLab. Ali was one of the original creators of the open source project Apache Spark, and ideas from his academic research in the areas of resource management, scheduling, and data caching have been applied to Apache Mesos and Apache Hadoop. Ali received his MBA from Mid-Sweden University in 2003 and his PhD from KTH/Royal Institute of Technology in Sweden in 2006 in the area of Distributed Computing. If you like what you've heard, please like, subscribe, or follow our show. To learn more about BreakLine Education, check us out at breakline.org.
When I was given the opportunity to chat with Chris, I wasn't going to pass up on it! However, deciding what to chat about was the tough part. When you read his bio below, you'll understand this is one truly busy man, and I had so many things to cover with him! Listen in to hear about the programmes they are managing, how the teams work and a whole lot more. I'll be talking more tech with Chris in a follow-up podcast in the next couple of months.
Chris Mattmann
Chris Mattmann is an experienced IT Executive, CTO and Division Manager of the AI, Analytics and Innovative Development Organization in the Information Technology and Solutions Directorate at NASA JPL. At JPL, Mattmann is the Chief Technology and Innovation Officer, reports to the CIO and Director for IT, and manages advanced IT research and open source and technology evaluation and user infusion capabilities. Mattmann is JPL's first Principal Scientist in the area of Data Science. The designation of Principal is awarded to recognize sustained outstanding individual contributions in advancing scientific or technical knowledge, or advancing the implementation of technical and engineering practices on projects, programs, or the Institution. He has over 20 years of experience at JPL and has conceived, realized and delivered the architecture for the next generation of reusable science data processing systems for NASA's Orbiting Carbon Observatory, NPP Sounder PEATE, and the Soil Moisture Active Passive (SMAP) Earth science missions. Mattmann's work has been funded by NASA, DARPA, DHS, NSF, NIH, NLM and by private industry and commercial partnerships. He was the first Vice President (VP) of Apache OODT (Object Oriented Data Technology), the first NASA project at the Apache Software Foundation (ASF), and he led the project's transition from JPL to the ASF.
He contributes to open source and was a member of the Board of Directors at the Apache Software Foundation (2013-18). He was one of the initial contributors to Apache Nutch, the predecessor to Apache Hadoop, as a member of its project management committee. Mattmann is the progenitor of the Apache Tika framework, the digital "babel fish" and the de facto content analysis and detection framework. Today Mattmann contributes to TensorFlow, Google's technology platform for all things machine learning, and has recently finished a book, Machine Learning with TensorFlow, Second Edition, published by Manning Publications. Mattmann is the Director of the Information Retrieval & Data Science (IRDS) group at USC and an Adjunct Research Professor. He teaches graduate courses in Content Detection & Analysis and in Search Engines & Information Retrieval. Mattmann has materially contributed to the understanding of the Deep Web and Dark Web through the DARPA MEMEX project. Mattmann's work helped uncover the Panama Papers scandal, the reporting on which won a Pulitzer Prize in 2017.
Big data and analytics play a critical role in every organization’s digital transformation. For many customers, it’s the first initiative they embark on to accelerate time to insight and pace of innovation for their business. Today, Simon is joined by Roy Hasson, Sr. Manager WW Analytics Specialist, to talk about how customers can modernize on-premises, self-managed Apache Hadoop environments running Apache Hive and Apache Spark using Amazon EMR - a fully managed, easy to use, unified analytics platform to run your Hive, Spark, Presto and other big data and ML workloads. They also discuss the EMR Migration Program, developed to help customers quickly modernize their on-premises, self-managed Hadoop environments.
The Twenty Minute VC: Venture Capital | Startup Funding | The Pitch
Ali Ghodsi is the Founder & CEO @ Databricks, bringing together data engineering, science and analytics on an open, unified platform so data teams can collaborate and innovate faster. To date, Ali has raised over $897M for the company including from the likes of a16z, NEA, Microsoft, Battery, Coatue, Greenbay and more. Prior to Databricks, Ali was one of the original creators of the open source project Apache Spark, and ideas from his research have been applied to Apache Mesos and Apache Hadoop. In Today’s Episode You Will Learn: 1.) How did Ali make his way from fleeing Iran as a refugee to living in a Swedish ghetto? What was the founding moment for Ali with Databricks? 2.) How does Ali think about and evaluate risk today? Why does Ali always make his team do downside scenario planning? How does Ali think about his relationship to money today? Why does Ali disagree with gut decisions? What is his process for making decisions effectively? 3.) Stage 1: The Search for PMF: What are the core elements included in this phase? What types of leaders thrive in this phase? What types struggle? How can leaders sustain morale in the early days when it is not up and to the right? Who are the crucial hires in this phase? 4.) Stage 2: Scale Go-To-Market: What are the core roles needed to expand GTM fast and effectively? Why should you hire sales leaders before marketing leaders? Why is hiring finance leaders so crucial here? What mistakes are most often made here? How does the board resolve them? 5.) Stage 3: Process and Efficiency: What are the first and most important processes that need to be implemented? How does Ali need to change the type of leader he is to fit this stage? How does one retain creativity and nimble decision-making at scale and with process? Items Mentioned In Today’s Episode: Ali’s Favourite Book: Good Strategy Bad Strategy: The Difference and Why it Matters. As always you can follow Harry and The Twenty Minute VC on Twitter here! 
Likewise, you can follow Harry on Instagram here for mojito madness and all things 20VC.
Should an Aspiring Data Scientist Learn More About Big Data? No. And in this episode, I'll tell you why. Youtube version: https://www.youtube.com/watch?v=StpXg0frfGE Article format here: https://data36.com/big-data-junior-data-scientist/ LINKS MENTIONED IN THE EPISODE: Image, here: https://data36.com/big-data-junior-data-scientist/ Apache Hadoop: https://hadoop.apache.org/ Apache Spark: https://spark.apache.org/ Python API (Spark): https://spark.apache.org/docs/latest/quick-start.html#self-contained-applications SparkSQL: https://spark.apache.org/docs/latest/sql-programming-guide.html PySpark with Pandas: https://spark.apache.org/docs/latest/sql-pyspark-pandas-with-arrow.html Newsletter: https://data36.com/newsletter Free mini-course: https://data36.com/how-to-become-a-data-scientist/ IMAGE SOURCES: - own presentations Check my website: https://data36.com Get access to more data science tutorials, join the inner circle: https://data36.com/inner-circle Find me on Twitter: https://twitter.com/data36_com
- Apache Hadoop
- HDFS
- Apache Hive
- Apache Spark
- Presto
- Architecture Of Giants: Data Stacks At Facebook, Netflix, Airbnb, And Pinterest
- Data Wrangling
- Null++ Docker Episode
- Julia Language
- Kaggle
- SED Podcast, Episode: Slack Data Platform with Josh Wills
- Article: Software 2.0
Aya's Recommendations for learning:
- Towards Data Science
- Statistics and Data Science MicroMasters
- DataCamp
- Udemy: Python for Data Science and Machine Learning Bootcamp
- Coursera's Deep Learning Specialization
- Lex Fridman Artificial Intelligence Podcast & YouTube channel
Episode Notes:
- Aya: How to Lie with Statistics (book)
- Luay: Great Expectations Data Pipeline Testing Framework
- Alfy: JAM Stack
“You are building features that your customers need right here, right now. You educate them on what can be done and the art of the possible. Then, you execute way better than your competitors. That's it.” - Amr Awadallah
Today, Stephanie is joined by Amr Awadallah, co-founder and Global Chief Technology Officer of Cloudera. Amr is fondly called the “geek from Egypt” as he migrated from Egypt and has a passion for computer science. Living in Egypt, Amr never aspired to become an entrepreneur; it wasn’t until he moved to the United States and graduated from college that his mindset changed. After taking a leave of absence from Stanford, Amr started his first successful company, Viva Smart, which he later sold to Yahoo. Amr joined Yahoo to learn the ropes of a corporate setting and eventually was running one of the very first organizations to use Apache Hadoop for data analysis and business intelligence. Once Amr had experienced the difference Hadoop made when testing it out with his team, he knew this was a great opportunity to start another company, and so he created Cloudera. On this episode of Mission Daily, Stephanie and Amr sit down to discuss the problems AI is solving on a daily basis, his journey from Egypt to the US, and launching his two companies.
Mission Daily and all of our podcasts are created with love by our team at Mission.org. We own and operate a network of podcasts and a brand story studio designed to accelerate learning. Our clients include companies like Salesforce, Twilio, and Katerra who work with us because we produce results. To learn more and get our case studies, check out Mission.org/Studios. If you’re tired of media and news that promotes fear, uncertainty, and doubt and want an antidote, you’ll want to subscribe to our daily newsletter at Mission.org. When you do, you’ll receive a mission-driven newsletter every morning that will help you start your day off right!
This podcast episode is about: The Hadoop ecosystem - what does the best-known big data platform deliver? // Contents // 1. What is Apache Hadoop? 2. Who developed Hadoop? 3. Which companies use Hadoop? 4. How does Hadoop work in detail? 5. What experience has Skillbyte had with Apache Hadoop? Timestamps: 01:17 -> What is Apache Hadoop? 04:20 -> Which companies is Hadoop relevant for? 07:38 -> Composition of the Hadoop distribution 10:17 -> How does Hadoop work in detail? 18:47 -> Apache Hive in detail 24:40 -> Apache Spark in detail 31:00 -> How can companies use the Hadoop ecosystem? 34:50 -> How German companies are currently positioning themselves in the big data space 40:20 -> What experience has Skillbyte had with Apache Hadoop? Subscribe to this podcast and visit us at https://www.skillbyte.de Feedback and questions are welcome at podcast@skillbyte.de
What exactly is data science? Is it a buzzword? A fancy way of saying advanced statistics and analytics? Or is it a mystery? Well, let’s find out.
This podcast covers:
- What data science is
- What the data science life cycle is
- What tools data scientists use to do their work
There are 5 stages to the data science process:
- Stage 1: Capturing the Data
- Stage 2: Maintaining the Data
- Stage 3: Processing the Data
- Stage 4: Analyzing the Data
- Stage 5: Communicating the Data
Click here to read the full blog: https://evidencen.com/what-is-data-science-a-mystery-or-not/
Visit my website: https://evidencen.com/
My data science and software engineering YouTube page: https://www.youtube.com/evidencen
Exclusive Data Science Community: https://datascienceacademy.mn.co/
Follow me on Twitter: https://twitter.com/evidencenmedia
My LinkedIn page: https://www.linkedin.com/in/evidencen/
Follow me on Instagram: https://www.instagram.com/evidencenmedia/
Contact me: https://evidencen.com/contact-me/
Suggest your content idea: https://evidencen.com/suggestions/
The authors of Machine Learning for Dummies, Judith Hurwitz and Daniel Kirsch, are here to help you. In this episode, Judith, Daniel and Al discuss the state of machine learning today, how to use it to advance your business as well as discoveries they made while writing their book. Learn how small and large businesses alike can find insights from data to enhance relationships with customers. We’ll also share where you can get a copy of Machine Learning for Dummies at no cost.
Show notes
01.00 Connect with Al Martin on Twitter and LinkedIn.
01.10 Connect with Kate Nichols on Twitter and LinkedIn.
01.15 Connect with Fatima Sirhindi on Twitter and LinkedIn.
02.00 Learn more about Hurwitz & Associates.
02.10 Connect with Judith Hurwitz on Twitter, LinkedIn and find her blog here.
03.20 Connect with Daniel Kirsch on Twitter and Hurwitz & Associates.
04.00 Read Machine Learning for Dummies by Judith Hurwitz and Daniel Kirsch.
04.40 Learn what neural nets are here.
04.50 Learn more about Arthur Samuel here.
05.00 Learn more about how Deep Blue beat the world chess champion.
15.39 Learn more about Apache Hadoop.
17.30 Learn more about IBM Watson.
26.50 Find Cognitive Computing and Big Data Analytics by Judith Hurwitz, Marcia Kaufman and Adrian Bowles.
27.45 Find Everybody Lies: Big Data, New Data and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz.
Episode 13 is all about large volumes of data and how they are processed. Dominik Benz explains how data engineers build data pipelines and what influence current developments such as the push for real-time data, the GDPR, and big data have. We explain technical fundamentals such as the difference between system time and processing time, as well as the problem of the resulting “late arrivals” and how to deal with them. We also, of course, cover the most important technologies of the big data universe, such as Apache Hadoop, ETL tools like Spark and NiFi, and the message broker Apache Kafka.
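The difference mentioned above, between the time a record is created and the time it is processed, and the "late arrivals" that result, can be illustrated with a minimal Python sketch. It is not taken from the episode: the window size, the allowed lateness, and the function name `assign_windows` are all hypothetical, `event_time` is assumed to correspond to what the description calls system time, and the watermark rule used here (highest event time seen so far minus an allowed lateness) is only one common convention.

```python
from collections import defaultdict

WINDOW_SECONDS = 60      # hypothetical tumbling-window size
ALLOWED_LATENESS = 120   # hypothetical grace period for late records


def assign_windows(events, window=WINDOW_SECONDS, lateness=ALLOWED_LATENESS):
    """Sum values per event-time window, dropping records that arrive too late.

    events: iterable of (event_time, processing_time, value) tuples,
            given in the order in which they are processed.
    Returns (window_sums, dropped_records).
    """
    sums = defaultdict(int)
    dropped = []
    watermark = float("-inf")
    for event_time, processing_time, value in events:
        # The watermark trails the highest event time seen so far.
        watermark = max(watermark, event_time - lateness)
        window_start = (event_time // window) * window
        window_end = window_start + window
        if window_end <= watermark:
            # The record's window is already considered closed: a late arrival.
            dropped.append((event_time, processing_time, value))
        else:
            sums[window_start] += value
    return dict(sums), dropped
```

A record created at second 50 but processed at second 75 still lands in the window starting at 0, because that window is within the grace period; a record created at second 20 but processed only after records from second 400 have been seen is treated as a late arrival and set aside.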
In this last Big Data news episode for the month of November, we look forward to the H2O World event next week in London and we have articles on BI Maturity and the upcoming Apache Ozone project that will supplant HDFS in future Hadoop clusters soon(TM). BI Maturity: You can’t get there from here! http://makingdatameaningful.com/bi-maturity/ Introducing Apache Hadoop Ozone: An Object Store for Apache Hadoop https://hortonworks.com/blog/introducing-apache-hadoop-ozone-object-store-apache-hadoop/ Katacoda example down on this page https://hadoop.apache.org/ozone Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
Changing the “culture” at a large company is impossibly hard, few get through it. And, it’s little wonder, you’re usually asking them to do completely irrational things. In the context of Google shutting down Google+ and a small write-up of Blockbuster failure fairy tales, we spend time discussing the “if it ain’t broke, don’t fix it” problem of digital transformation. We then talk about Elastic search and their recent IPO, and follow-up with some better commentary on Cloudera and Hortonworks merging - better than we did last week. Hotel breakfast buffet strategies and the Chase Sapphire series of cards. Oh, and before that Matt and Coté spend a good 10 to 15 minutes talking about hotel breakfast buffet strategies. Also, it’s episode #150 - yay us! Our first episode was on May 27th, 2014, where Coté’s lamp played a prominent role, and we did video (https://www.youtube.com/watch?v=4S0_PzuYJJE&index=58&list=PLk_5VqpWEtiWnQ7od08nzkB32oT4gnDiP). Relevant to your interests Chase Sapphire Reserve (https://creditcards.chase.com/a1/sapphire/reserve), and others in the Sapphire line (https://www.chase.com/personal/credit-cards/sapphire-on-location). AAdvantage Executive card (https://secure.fly.aa.com/citi/direct-exec?anchorLocation=DirectURL&title=citiexecutive). SpringOne Platform videos (https://www.youtube.com/playlist?list=PLAdzTan_eSPQsR_aqYBQxpYTEQZnjhTN6&disable_polymer=true) are all up. Coté went to Puppetizer 2018 Amsterdam. They’re really into being “a portfolio company” (https://www.instagram.com/p/BovoMzaCxsJ/?taken-by=bushwald) now. Lots of stacks presented (http://cote.coffee/2018/10/10/thats-some-stack.html); much discussion on managing Puppet itself. A very well run event. See also Register (https://www.theregister.co.uk/2018/10/09/puppet_data_exhaust/) coverage of their SF event (https://www.theregister.co.uk/2018/10/09/puppet_data_exhaust/). 
Google is shutting down Google+ following massive data exposure (https://www.engadget.com/2018/10/08/google-shutting-down-google-plus/) - “90 percent of Google+ user sessions last for less than five seconds.” be like google prd mgmt desertion effect other enterprise props? legacy services OpenOffice watch (https://www.theregister.co.uk/2018/10/10/apache_open_office_not_dead/) - ‘Back in 2015, Red Hat developer Christian Schaller called OpenOffice "all but dead."’ Austin Ernest says make sure you don’t cargo cult The SRE (https://theagileadmin.com/2018/10/02/sre-the-biggest-lie-since-kanban/). The Demise of Blockbuster, and Other Failure Fairy Tales (https://medium.com/s/story/how-blockbuster-kodak-and-xerox-really-failed-its-not-what-you-think-e0a8c12e863d) - Strategy is hard, execution at the middle-management layer is harder. Put yet another way, company executives have a lot less power than you’d think. Related: WTF is “culture” (https://twitter.com/cote/status/1050246624881061889)? This week in IPOs: Elastic has a party, Solarwinds figuring one out (https://www.channele2e.com/business/finance/solarwinds-ipo-plan-update/). Elastic (https://www.cnbc.com/2018/10/05/elastic-estc-ipo-stock-makes-debut-on-nyse.html): “The stock closed at $70 per share, representing a 94.4 percent rise.” Close of market on Oct. 10th (https://finance.yahoo.com/quote/ESTC?p=ESTC): $62.50 per share. 
451 on Elastic revenue, Scott Denne (https://blogs.the451group.com/techdeals/ipo/elastic-adds-spring-to-the-fall-ipo-market/): “The developer of open source search software for IT log analysis, security analytics and other applications nearly doubled its top line in its fiscal year (ending April 30) to $160m, up from $88m a year earlier, while increasing the share of subscription revenue in its mix.” More: “Judging by Elastic’s offering, the [Q3] dry spell had little impact on investor appetites, setting up a favorable environment for Anaplan and SolarWinds as both look to price this month.” 451 on Elastic’s product, Nancy Gohring: “One of the most important messages that emerged from ElasticOn is that Elastic is positioning its software to serve as a platform for collecting and analyzing a wide array of machine data that can be used in a variety of use cases. With its recently announced APM UI and the forthcoming Infra UI, as well as the Canvas visualization capabilities, SQL-like querying and advancing machine-learning techniques, the Elastic Stack will be usable as a centralized platform for collecting and analyzing logs, events and metrics by constituents within a business including IT ops, security, executive leadership, product management and others.” So, Elastic is…an OSS (presumably) cheaper Splunk, but for general search not just IT? Or, wait, it is just IT stuff? Solarwinds: Coté hasn’t been able to parse out the Solarwinds deal. 
The big question is/will be, “so, did it make sense to go private, or could that have done whatever they’re doing by staying public?” Serverless and FaaS, survey shows confusion (https://thenewstack.io/add-it-up-serverless-faas/): “Despite attempts to educate the market, we still believe the word “serverless” connotes many different things, especially for the 79 percent of organizations that plan to adopt serverless architecture but have not planned to use FaaS in the next 18 months.” Coté’s old saw that “serverless” has just come to mean “doing programming on-top of cloud shit.” This is what Pivotal usually means when they say “cloud native,” versus the container kids who mean just “kubernetes,” at broadest, “containers.” Cloudera/Hortonworks follow-up: TPM (https://www.nextplatform.com/2018/10/05/hadoop-needs-to-be-a-business-not-just-a-platform/): “Cloudera has raked in $1.28 billion in revenues in the past six and a half years, while Hortonworks only brought in $808 million. Add in the venture capital of $1.31 billion in venture capital, plus $225 million that Cloudera raised in early 2017 for its IPO and the $100 million that Hortonworks raised in late 2014 from its IPO, and the total pile of cash that has come to the pair is $3.69 billion. Hortonworks still has $86 million of cash and Cloudera still has $440.1 million. But over that same time period, Cloudera has booked cumulative losses of $1.19 billion and Hortonworks has cumulative losses of $979 million, for a total of $2.16 billion. Both separately and together, these companies are burning the wood a lot faster than they can cut it.” TPM’s TAM summary, as suggested by the two companies: “The core market that Hadoop is chasing is comprised of three different segments, according to Cloudera-Hortonworks, and will grow at a compound annual growth rate of 21 percent between 2017 and 2022, from $12.7 billion to $32.3 billion. 
Within that, cognitive and artificial intelligence workloads represent a $14.3 billion opportunity in 2022, $4.9 billion for advanced and predictive analytics software, and $13.2 billion for dynamic data management systems (what we would call modern storage). In addition to that, the Hadoop platform is also chasing relational and non-relational database management systems and data warehouses, which is another $51 billion opportunity in 2022, for a total TAM of $83 billion. Even a small slice of this, which is what Hadoop currently gets today, could be billions of dollars by then.” Forrester on TAM penetration, Noel Yuhanna (https://go.forrester.com/blogs/cloudera-and-hortonworks-merger-a-win-win-for-all/): “We estimate that [just] 7% of organizations have completely migrated their traditional data warehouses to big data platforms.” That leaves 93% still to go; assuming 20% capture for a leader, that's (shoddy percentage math follows) about 18 to 19% of organizations, I guess? Meanwhile, also from Forrester (https://www.forrester.com/report/Digital+Insights+Are+The+New+Currency+Of+Business/-/E-RES119109): “While 74% of global data and analytics decision makers tell us they will have invested in a big data lake by the end of 2017, we find that many of these are being kept on life support by the technology management shops that drove them.” Also, Forrester on HARK (Hadoop & Spark), Noel Yuhanna & Mike Gualtieri (https://www.forrester.com/report/Now+Tech+HadoopSpark+Platforms+Q3+2018/-/E-RES142699): “Distributed computing software and services that are rooted in open source Apache Hadoop and Apache Spark to store, process, and analyze data to find and use insights to improve customer experiences, create timely business intelligence, optimize business processes, and make decision making smarter and faster.” Like traditional analytics, but bigger and with more ML? 
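For anyone who wants to check the back-of-the-envelope numbers in the TAM and penetration notes above, here is a minimal Python sketch. The dollar figures and the 7% estimate are taken from the quotes; the 20% leader-capture rate is only the notes' own guess, and the variable names are made up for illustration. It is a sanity check on the arithmetic, not a market model.

```python
# Rough sanity check of the TAM and penetration figures quoted above.
# All dollar amounts are in billions of USD, as stated in the quotes.

# Core market: $12.7B in 2017 growing to $32.3B in 2022 (five years).
core_2017 = 12.7
core_2022 = 32.3
years = 2022 - 2017
# Compound annual growth rate implied by those two endpoints.
implied_cagr = (core_2022 / core_2017) ** (1 / years) - 1

# The three segments quoted for 2022, plus the adjacent database/warehouse market.
segments_2022 = [14.3, 4.9, 13.2]
adjacent_2022 = 51.0
segment_sum = sum(segments_2022)
total_tam = segment_sum + adjacent_2022

# Penetration guess: 7% of organizations have fully migrated (Forrester estimate).
migrated_share = 0.07
# ASSUMPTION from the notes, not from Forrester: a leader captures 20% of the rest.
leader_capture = 0.20
remaining_share = 1.0 - migrated_share
leader_share_of_all_orgs = remaining_share * leader_capture

print(f"Implied CAGR 2017-2022: {implied_cagr:.1%}")
print(f"Sum of the three segments: ${segment_sum:.1f}B")
print(f"Total TAM incl. databases and warehouses: ${total_tam:.1f}B")
print(f"Leader's share of all organizations: {leader_share_of_all_orgs:.1%}")
```

The implied growth rate and segment totals land within rounding distance of the quoted 21 percent, $32.3 billion and $83 billion, and the leader's share is simply the 93% remainder multiplied by the assumed 20%.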
451 (Matt Aslett & James Curtis) (https://clients.451research.com/reportaction/95775/Toc?SearchTerms=Cloudera): “Although there are cross-selling opportunities and the two companies share an underlying open source foundation, there are also significant areas of product overlap and competing functionality, as well as a history of animosity to overcome.” Tamped down TAM: “Another way of looking at this is that the Hadoop market hasn't expanded enough to support the growth targets of two independent publicly traded companies, especially with the cloud providers to contend with.” Cloudera is the winner: “While the deal is being described by the companies as a merger, make no mistake that Cloudera is acquiring Hortonworks. After the transaction closes, Cloudera shareholders will own approximately 60% of the combined company, which will do business as Cloudera, with Hortonworks shareholders owning approximately 40%.” Products, Hortonworks: “Its primary product is the Hortonworks Data Platform (HDP), which consists of core Hadoop and some 20+ open source projects. But in August 2015, the company purchased Onyara, which was based on the Apache NiFi technology, and designed to enable users to collect, process and distribute data.” Products, Cloudera: “To date, Cloudera offers several products and while Hortonworks has adopted a pure 100% open source approach. Cloudera has a hybrid strategy, mixing open source with its proprietary tooling. The company's core offering is the Cloudera Enterprise Data Hub (CDH) – specifically targeted products are provided for data warehousing, operational database, and data science and engineering. 
Its cloud offering is Altus, a PaaS available on AWS and Azure.” 451 in another report (Agatha Poon) (https://clients.451research.com/reportaction/95135/Toc?SearchTerms=Cloudera), on Cloudera, June 2018: “At present, data analytics tools and offerings are driving regional opportunities with enterprises slowly but clearly moving out from legacy data warehouse platform to a new generation of data analytics platform, which is highly distributed and open standards based, Cloudera says. For machine learning and advanced data analytics, the company believes that data scientists will be the main users and strategic partners to boost future uptake. While data scientists can make use of algorithms to train the model into production data clusters, it could be a time-consuming and complex endeavor. With that in mind, Cloudera has stepped up its game by acquiring applied machine learning research startup Fast Forward Labs in late 2017, deepening its expertise in applying machine learning to practical business problems. The bigger Cloudera says it is committed to researching new techniques to resolve real-world business problems, building codes as well as providing customers with machine learning advisory services leveraging Fast Forward Labs' domain expertise.” Cloudera strategy: “Cloudera's proposition remains largely unchanged: lead machine learning in the enterprise, disrupt the data warehouse market for analytical and operational data workloads, capitalize on cloud adoption and drive innovation for simplification while mitigating data security risk. With cloud being an agent for digital transformation, the company has publicly announced its intent to lead with cloud innovation as part of the future growth strategy at the company level.” Conferences, et. al. Oct 16th - DevOpsDays Paris (https://www.devopsdays.org/events/2018-paris/welcome/) - Coté at a table. Pivotal will have a raffle! 
Oct 17th - JDriven Managers summit (https://www.jdriven.com/events/) - near Amsterdam - Coté talking. Oct 22nd - Cloud Native tour in Milan, Italy (https://connect.pivotal.io/milan_cloud_native_advocate_22oct.html). Coté and friends: a half day, a summit on Spring, DevOps, and cloud native programming. Free. Oct 31st - Coté speaking at New Relic’s FutureStack Amsterdam (https://web.cvent.com/event/23ce37e7-6077-42f5-8015-4a47a0cee30d/summary). Nov 3rd to Nov 12th - SpringOne Tour (https://springonetour.io/) - all over the earth! Coté will be MC’ing Beijing Nov 3rd, Seoul Nov 8th, Tokyo Nov 6th, and Singapore Nov 12th (https://springonetour.io/2018/singapore). Nov 14th to 16th - Devoxx Belgium (https://devoxx.be/), Antwerp. Coté’s presenting on enterprise architecture (https://dvbe18.confinabox.com/talk/ASN-9274/Rethinking_enterprise_architecture_for_DevOps,_agile,_&_cloud_native_organizations). Dec 12th and 13th - SpringTour Toronto (http://springonetour.io/2018/toronto), Coté. Nonsense Costco sought to provide a streaming service to customers (https://www.axios.com/costco-streaming-service-media-walmart-63c67545-67ef-4725-861f-fb70d285eb69.html). Listener Feedback Jermey is professor at a university in Chicago teaching cloud native and "devops" technologies to undergrads. “The Podcast has been a great benefit to the students. Could I get a few stickers to pass out to them?” SDT news & hype Join us in Slack (http://www.softwaredefinedtalk.com/slack) - new #upvoteplease channel for shameless (self) promotion. Subscribe to Software Defined Interviews Podcast (http://www.softwaredefinedinterviews.com/) - Cote on Tech Evangelism (http://www.softwaredefinedinterviews.com/75) CashedOut.coffee podcast (http://www.cashedout.coffee/). Send your postal address to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com) and we will send you a sticker. 
Brandon built the Quick Concall iPhone App (https://itunes.apple.com/us/app/quick-concall/id1399948033?mt=8) and he wants you to buy it for $0.99. Recommendations Brandon: Dr. Foster (https://www.netflix.com/title/80097034) on Netflix (https://www.netflix.com/title/80097034) Matt: Slint documentary Breadcrumb Trail (https://www.youtube.com/watch?v=GsRpS6XGiOs&t=). Coté: micro.blog (https://micro.blog/), where Coté now has cote.coffee (http://cote.coffee/) hooked up with some Instagram and Pinboard IFTTT wingdings. Drafts 5 seems fine. Coté needs help figuring out WTF “culture” is from a practical angle (https://twitter.com/cote/status/1050246624881061889).
Big Data News at the end of the summer is not easy to find, but we did end up with three topics to discuss: from isolating GPUs in Hadoop 3.x to replicating big data (to the cloud) and quick tips from Adam's blog. Breaking News First Class GPUs support in Apache Hadoop 3.1, YARN & HDP 3.0 https://hortonworks.com/blog/gpus-support-in-apache-hadoop-3-1-yarn-hdp-3/ Replicating big datasets in the cloud https://medium.com/hotels-com-technology/replicating-big-datasets-in-the-cloud-c0db388f6ba2 https://dataworkssummit.com/berlin-2018/session/tools-and-approaches-for-migrating-big-datasets-to-the-cloud/ https://www.slideshare.net/Hadoop_Summit/tools-and-approaches-for-migrating-big-datasets-to-the-cloud Quick Tip: The easiest way to grab data out of a web page in Python https://medium.com/@ageitgey/quick-tip-the-easiest-way-to-grab-data-out-of-a-web-page-in-python-7153cecfca58 Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
In this week's edition of Roaring Big Data News, Dave talks about modernizing Hadoop and a billion Java errors. Jhon has an article on improving your training data sets. We finish with a discussion about the newly released HDP 2.6.5 with an emphasis on the deprecation notices and YARN containers. Breaking News Dave Modernizing Hadoop: Reaching the plateau of productivity https://www.zdnet.com/article/modernizing-hadoop-reaching-the-plateau-of-productivity/ 1 billion Java errors, here’s what causes 97% of them https://blog.takipi.com/we-crunched-1-billion-java-logged-errors-heres-what-causes-97-of-them/ https://blog.takipi.com/the-top-10-exceptions-types-in-production-java-applications-based-on-1b-events/ Jhon Why you need to improve your training data, and how to do it https://petewarden.com/2018/05/28/why-you-need-to-improve-your-training-data-and-how-to-do-it/amp/ Announcing the General Availability of Hortonworks Data Platform (HDP) 2.6.5, Apache Ambari 2.6.2 and SmartSense 1.4.5 https://hortonworks.com/blog/announcing-general-availability-hortonworks-data-platform-hdp-2-6-5-apache-ambari-2-6-2-smartsense-1-4-5/ Component Versions https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_release-notes/content/comp_versions.html Deprecation Notices https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_release-notes/content/deprecated_items.html YARN Containers Trying out Containerized Applications on Apache Hadoop YARN 3.1 https://hortonworks.com/blog/trying-containerized-applications-apache-hadoop-yarn-3-1/ Containerized Apache Spark on YARN in Apache Hadoop 3.1 https://hortonworks.com/blog/containerized-apache-spark-yarn-apache-hadoop-3-1/ Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
How do we educate the new, younger workforce coming into Oil and Gas? Mark and Guest-host, Paige Wilson, had a chance to talk with Mark Mathis, President at Clear Energy Alliance during our monthly happy hour about how he got started in making documentaries on Oil and Gas and short videos on energy. "We make short-run videos, on a wide variety of topics, that are not just oil and gas, it's renewables, it's climate change, it's anything connected to energy, we will do a video on it. Generally 4-4.5 minutes" says Mark Mathis of Clear Energy Alliance. "We can educate people in small bites along the way, get them engaged in it and interested, and then have them come back for another bite on the next video." Click Play to Hear the Oil and Gas HSE Podcast Episode 83 – Clear Energy Alliance Upcoming Events OGGN's Monthly Happy Hour: June 26th at TechSpace, 2101 CityWest Blvd., from 6:00 - 9:00 pm. There will be a giveaway of $250 cash at the event for the person who refers the most friends to sign up! Enter here. Click here to register! A special THANK YOU to this month's Happy Hour sponsors: Hortonworks and ReactWell Hortonworks is a leading innovator in the industry, creating, distributing and supporting enterprise-ready open data platforms and modern data applications. Their mission is to manage the world’s data. They have a single-minded focus on driving innovation in open source communities such as Apache Hadoop, NiFi, and Spark. ReactWell provides advanced technology services and products for organizations in the energy, chemical, oil and gas and petrochemical industry, doing business inside and out of the United States of America. ReactWell achieves material quality results for their clients by blending creative solutions, constrained by the laws of hard science and scarce resources. Enter to Win! To get your hands on one of these awesome offshore bags, all you have to do is enter! Follow the link below and select Oil and Gas HSE and enter your information. 
We pick one lucky winner each week. Click Here to Enter More Information To find out more about Clear Energy Alliance, you can find them at https://clearenergyalliance.com/ Like Clear Energy Alliance on Facebook. Check out some of Clear Energy Alliance videos on YouTube. Follow Clear Energy Alliance on Twitter @clearenergy Connect with Mark Mathis on Linkedin. Listen to Mark on OGIL. Leave a Review Help your oil and gas peers find the Oil and Gas HSE Podcast by leaving us a review on iTunes. The more, and better our reviews, the easier we are to find in iTunes, so help the industry out by leaving us a short review. Leave us a review by clicking here. If you would like some help leaving a review on iTunes the folk at HubSpot put together some easy to follow instructions that you can check out by clicking here. Upcoming Events Red Wing's Oil and Gas HSE Podcast is hitting the road. Our travel is made possible by our On The Road Sponsors: Here are all of the upcoming events we will be attending: IDT Expo 2018: June 28th at Norris Conference Center from 8AM-3:45PM. This Is Your Show Tell Mark and Patrick what topics you would like to hear discussed on the show! Click Here to Email Us Global Oil and Gas Network LinkedIn Group Join the conversation with some of the most influential people working in the oil and gas industry! Click Here to Join Free Resources for Our Audience Get Mark's Oil and Gas Events Newsletter. Digital marketing audit of your website. The Oil and Gas Global Network of Podcasts Oil and Gas Global Network | oilandgasglobalnetwork.com Oil and Gas This Week | oilandgasthisweek.com Oil and Gas Industry Leaders | oilandgasindustryleaders.com Connect With Us Patrick Pistor | Twitter | LinkedIn | Email |Facebook |leanoilfield.com Mark LaCour | Twitter | LinkedIn | Email |Facebook | modalpoint.com Clear Energy Alliance on Red Wing's Oil and Gas HSE Podcast - OGHSE083
In this episode, Paige sits with Jack Hinton at The Capital Grille CityCentre to discuss his journey in the Oil and Gas Industry to his current role as a Chief HSE Officer at Baker Hughes a GE Company. Reach out to Jack and learn more about Baker Hughes a GE Company. Leave a Review Enjoy listening? Support the show by leaving a review in iTunes. Sign Up and Win Click here to sign up here to win a FR Shirt and FR Base Layer from Bulwark! 2018 Event Sponsors OGGN is always accepting event sponsors. If you would like to get your company in front of our large global audience, reach out to us and we would be happy to share the details. Events on Deck OGGN's Monthly Happy Hour: June 26th at TechSpace, 2101 CityWest Blvd., from 6:00 – 9:00 pm. There will be a giveaway of $250 cash at the event for the person who refers the most friends to sign up! Enter here. Click here to register! A special THANK YOU to this month's Happy Hour sponsors: Hortonworks and ReactWell Hortonworks is a leading innovator in the industry, creating, distributing and supporting enterprise-ready open data platforms and modern data applications. Their mission is to manage the world's data. They have a single-minded focus on driving innovation in open source communities such as Apache Hadoop, NiFi, and Spark. ReactWell provides advanced technology services and products for organizations in the energy, chemical, oil and gas and petrochemical industry, doing business inside and out of the United States of America. ReactWell achieves material quality results for their clients by blending creative solutions, constrained by the laws of hard science and scarce resources. IDT Expo 2018 - Come and say hi to the OGGN crew on Thursday, June 28th at the Norris Conference Center. 
More Oil and Gas Global Network Podcasts Oil and Gas This Week Podcast | Oil and Gas HS&E Podcast Engage with Oil and Gas Global Network LinkedIn Group | Facebook | modalpoint | Lean Oilfield | WellHub David Studio Emin is OGGN's Professional Audio Editor for all of our shows. If you're interested in services, send an e-mail with OGGN in the subject to receive $5 off. Connect with Paige Wilson LinkedIn | Twitter | E-Mail | Oil and Gas Global Network Jack Hinton on Oil and Gas Industry Leaders Podcast - OGIL038
What is the root cause of accidents, is it one person's fault? Or the culture as a whole? Mark and Guest-host, Paige Wilson, had a chance to talk with Steven Riddle, Global Operations Integrity at ExxonMobil during OTC 2018 about how implementing change in culture will drive for change within the industry. "So what we as an industry need to do is understand within our cultures, we need to understand why, the why people make choices. Understand the why behind it, you know what empowers them to make the right choice, the safe choice" says Steven Riddle, Global Operations Integrity at ExxonMobil. Click Play to Hear the Oil and Gas HSE Podcast Episode 82 – Exxonmobil at OTC 2018 Upcoming Events OGGN's Monthly Happy Hour: June 26th at TechSpace, 2101 CityWest Blvd., from 6:00 - 9:00 pm. There will be a giveaway of $250 cash at the event for the person who refers the most friends to sign up! Enter here. Click here to register! A special THANK YOU to this month's Happy Hour sponsors: Hortonworks and ReactWell Hortonworks is a leading innovator in the industry, creating, distributing and supporting enterprise-ready open data platforms and modern data applications. Their mission is to manage the world’s data. They have a single-minded focus on driving innovation in open source communities such as Apache Hadoop, NiFi, and Spark. ReactWell provides advanced technology services and products for organizations in the energy, chemical, oil and gas and petrochemical industry, doing business inside and out of the United States of America. ReactWell achieves material quality results for their clients by blending creative solutions, constrained by the laws of hard science and scarce resources. Enter to Win! To get your hands on one of these awesome offshore bags, all you have to do is enter! Follow the link below and select Oil and Gas HSE and enter your information. We pick one lucky winner each week. 
Click Here to Enter More Information To find out more about Exxonmobil, you can find them at http://corporate.exxonmobil.com/ Connect with Exxonmobil on LinkedIn. Like Exxonmobil on Facebook. Check out some of Exxonmobil's videos on YouTube. Follow Exxonmobil on Twitter @exxonmobil Connect with Steven Riddle on Linkedin. Leave a Review Help your oil and gas peers find the Oil and Gas HSE Podcast by leaving us a review on iTunes. The more, and better our reviews, the easier we are to find in iTunes, so help the industry out by leaving us a short review. Leave us a review by clicking here. If you would like some help leaving a review on iTunes the folk at HubSpot put together some easy to follow instructions that you can check out by clicking here. Upcoming Events Red Wing's Oil and Gas HSE Podcast is hitting the road. Our travel is made possible by our On The Road Sponsors: Here are all of the upcoming events we will be attending: TBD This Is Your Show Tell Mark and Patrick what topics you would like to hear discussed on the show! Click Here to Email Us Global Oil and Gas Network LinkedIn Group Join the conversation with some of the most influential people working in the oil and gas industry! Click Here to Join Free Resources for Our Audience Get Mark's Oil and Gas Events Newsletter. Digital marketing audit of your website. The Oil and Gas Global Network of Podcasts Oil and Gas Global Network | oilandgasglobalnetwork.com Oil and Gas This Week | oilandgasthisweek.com Oil and Gas Industry Leaders | oilandgasindustryleaders.com Connect With Us Patrick Pistor | Twitter | LinkedIn | Email |Facebook |leanoilfield.com Mark LaCour | Twitter | LinkedIn | Email |Facebook | modalpoint.com BHGE at OTC 2018 on Red Wing's Oil and Gas HSE Podcast - OGHSE082
Almost a year ago, Patricia Florissi blogged about a Distributed Data Analytics Platform and the World Wide Herd (WWH): the concept of creating a global network of distributed Apache Hadoop instances that form a single virtual computing cluster. The concept opens the door to a huge range of possibilities, orchestrating computations across the globe from edge to core to cloud. I caught up with Ahmed Osama (@Ahmed_OZ), Sr Innovation Manager, and Lori Schlesman (@LoriSchles), Lead Program Manager for the WWH Project, to find out how our customers have been using the platform across various use cases. For more information, drop an email to federatedanalytics@Dell.com! Get Dell EMC The Source app in the Apple App Store or Google Play, and Subscribe to the podcast: iTunes, Stitcher Radio or Google Play. Dell EMC The Source Podcast is hosted by Sam Marraccini (@SamMarraccini)
The first news episode of 2018 has landed. We discuss the new Big Data architecture at CERN, a curious case of a broken benchmark and the future plans of the Apache Hadoop project. Breaking News The Architecture of the Next CERN Accelerator Logging Service https://databricks.com/blog/2017/12/14/the-architecture-of-the-next-cern-accelerator-logging-service.html The Curious Case of the Broken Benchmark: Revisiting Apache Flink® vs. Databricks Runtime https://data-artisans.com/blog/curious-case-broken-benchmark-revisiting-apache-flink-vs-databricks-runtime Hadoop 3.0 Ships, But What Does the Roadmap Reveal? https://www.datanami.com/2017/12/15/hadoop-3-0-ships-roadmap-reveal/ Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
It's here: the final news episode for 2017! We finish off the year talking about Apache Pulsar, Hadoop Delegation tokens (aka Kerberos), the Hadoop on Container hype (or is it?), the Apache Hadoop 3.0 release and all you need to know about Data Prepping (or at least all we can tell you in about 10 minutes, that is). Breaking News Jhon Comparing Pulsar and Kafka: unified queuing and streaming https://streaml.io/blog/pulsar-streaming-queuing/ Hadoop Delegation Tokens Explained http://blog.cloudera.com/blog/2017/12/hadoop-delegation-tokens-explained/ Hadoop and Containers Big Data and Container Orchestration with Kubernetes (K8s) https://www.bluedata.com/blog/2017/12/big-data-container-orchestration-kubernetes-k8s/ Spark on Kubernetes series https://banzaicloud.com/blog/spark-k8s/ https://banzaicloud.com/blog/scaling-spark-k8s/ https://banzaicloud.com/blog/zeppelin-spark-k8/ Data Prepping in the clouds Google Cloud Dataprep: Spreadsheet-Style Data Wrangling Powered by Google Cloud Dataflow https://medium.com/mark-rittman/google-cloud-dataprep-spreadsheet-style-data-wrangling-powered-by-google-cloud-dataflow-a48c405d81c Data Transformations “By Example” in the Azure Machine Learning Workbench https://blogs.technet.microsoft.com/machinelearning/2017/09/25/by-example-transformations-in-the-azure-machine-learning-workbench/ Dave Hadoop 3.0 Released on December 13th 2017 http://hadoop.apache.org/docs/r3.0.0/index.html http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/RELEASENOTES.3.0.0.html http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-common/release/3.0.0/CHANGES.3.0.0.html Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
Customers are migrating their analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premise deployments to AWS in order to save costs, increase availability, and improve performance. AWS offers a broad set of analytics services, including solutions for batch processing, stream processing, machine learning, data workflow orchestration, and data warehousing. This session will focus on identifying the components and workflows in your current environment; and providing the best practices to migrate these workloads to the right AWS data analytics product. We will cover services such as Amazon EMR, Amazon Athena, Amazon Redshift, Amazon Kinesis, and more. We will also feature Vanguard, an American investment management company based in Malvern, Pennsylvania with over $4.4 trillion in assets under management. Ritesh Shah, Sr. Program Manager for Cloud Analytics Program at Vanguard, will describe how they orchestrated their migration to AWS analytics services, including Hadoop and Spark workloads to Amazon EMR. Ritesh will highlight the technical challenges they faced and overcame along the way, as well as share common recommendations and tuning tips to accelerate the time to production.
Dr. Manjeet Rege and Dan Yarmoluk speak with Sean Owen, Director of Data Science at Cloudera. Before Cloudera, Sean founded Myrrix Ltd (now the Oryx project) to commercialize large-scale real-time recommender systems on Apache Hadoop. He is an Apache Spark Committer and co-authored Advanced Analytics on Spark. He was a Committer and VP for Apache Mahout, and co-author of Mahout in Action. Previously, Sean was a senior engineer at Google. The conversation ranges from Cloudera solutions and data science skills to Altus and how to harness Big Data in the ever-changing landscape of technology.
Today Francesc and Mark have the honor to be joined by Alim Jaffer and Mo Firouz from Heroic Labs to discuss their open source framework for social and realtime apps and games. About Alim Jaffer A member of the founding team, Alim joined Heroic Labs in 2016 as the VP of Product after having worked in startups focused in the games and health verticals. He is based in Vancouver, Canada and San Francisco. About Mo Firouz Mo cofounded Heroic Labs and is part of the core engineering team. Mo has worked on various products in Heroic Labs including the core Nakama server as well as Heroic Managed Cloud where he was primarily responsible for automating server provisioning and the monitoring stack with Kubernetes. Mo previously worked as a system architect in VisualDNA and built scalable big-data analytics systems, and prior to that built realtime high frequency trading systems. Cool things of the week CRE life lessons: What is a dark launch, and what does it do for me? blog post Cloud Dataproc is now even faster and easier to use for running Apache Spark and Apache Hadoop announcement Canary Deployments using Istio blog post Interview Heroic Labs heroiclabs.com Heroic Labs on GitHub repository Heroic Labs Documentation Google Container Engine CockroachDB Question of the week Accessing Cloud SQL instances from Cloud Functions? Use SQL Proxy, as for the Managed Instance Group, which we cover on episode 81. Connecting MySQL Client from Compute Engine About the Cloud SQL Proxy CloudSQL Proxy GitHub repo Where can you find us next? Francesc just released a justforfunc episode on Contributing to the Go project. He'll be soon taking some well deserved holidays! Mark will be speaking at Pax Dev and then attending Pax West right after.
Apache BigData: Recap of Apache BigData. DataStax announces availability of ‘white glove’ managed cloud service http://diginomica.com/2017/05/23/datastax-announces-availability-white-glove-managed-cloud-service/amp/ CockroachDB 1.0 is Production-Ready https://www.cockroachlabs.com/blog/cockroachdb-1-0-release/ Local and distributed query processing in CockroachDB https://www.cockroachlabs.com/blog/local-and-distributed-processing-in-cockroachdb/# Azure Cosmos DB https://speakerdeck.com/dharmashukla/azure-cosmos-db-lessons-learnt-from-building-a-globally-distributed-database-from-the-ground-up https://channel9.msdn.com/Events/Build/2017/KEY01#time=1h27m20s https://softwareengineeringdaily.com/2017/06/01/cosmosdb-with-andrew-hoh/ A Vision for Making Deep Learning Simple https://databricks.com/blog/2017/06/06/databricks-vision-simplify-large-scale-deep-learning.html Spark gets automation: Analyzing code and tuning clusters in production http://www.zdnet.com/article/spark-gets-automation-analyzing-code-and-tuning-clusters-in-production/ https://www.pepperdata.com/press-releases/pr_052317/ What’s New in Hadoop 3.0 – Enhancements in Apache Hadoop 3 https://www.edureka.co/blog/hadoop-3/ Apache Flink® 1.3.0 and the Evolution of Stream Processing with Flink https://data-artisans.com/blog/apache-flink-1-3-0-evolution-stream-processing You are not Google https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb Master time with Kibana’s new time series visual builder https://www.elastic.co/blog/master-time-with-kibanas-new-time-series-visual-builder?blade=tw Teradata doubles down http://www.zdnet.com/google-amp/article/teradata-doubles-down/
Data industry guru and analyst John Myers joins Phil Bowermaster for a discussion about the future of big data. Is big data, and Apache Hadoop in particular, experiencing some kind of existential crisis? Has Hadoop "failed us?" John describes how the crisis may be one of mismatched expectations, explaining that Hadoop is free in the same way that a free puppy is free. Moreover, while Hadoop addresses the major challenges of the big data era, it does not automatically solve the myriad other data management problems that data warehouse and database systems have been addressing for years. So is Hadoop over, or is it just getting started? And what does the future of Hadoop have to tell us about the future of big data, and therefore the future of everything else? Tune in and explore. About Our Guest John Myers has nearly 20 years of experience in areas related to business analytics and business intelligence in professional services, sales consulting, product management, industry analysis and research. John is a frequent contributor to industry publications including Search Business Analytics, Inside Analysis and Information Management. He speaks internationally on the topics of telecom analytics, data virtualization and Big Data. John has been previously named as one of the Top 100 Big Data Influencers. He is a Managing Research Director of Business Intelligence and Analytics Practice with EMA. WT 294-603
IT Best Practices: Episode 97 – Big data and advanced analytics have become increasingly important to Intel. More and more people and business groups rely on Intel’s big data capabilities to maximize productivity. A recent step in the evolution of big data at the company was the move last year from the Intel Distribution for […]
IT Best Practices: Intel IT values open-source-based, big data processing using Apache Hadoop software. Last year we migrated from the Intel Distribution for Apache Hadoop software to the Cloudera Enterprise software. Based on our original experience with Apache Hadoop software, Intel IT identified new opportunities to reduce IT costs and extend our business intelligence capabilities. […]
With Doug Cutting, the chief architect of Cloudera and the co-founder of Apache Hadoop, we discussed his journey from software engineer to contributor to and creator of several widely used open source projects. We also took a deep dive into Cloudera and Hadoop, and discussed various topics from talent shortage to how Hadoop […] The post Episode 79: Hadoop, Cloudera & Big Data with Doug Cutting appeared first on Analyse Asia.
Kito and Daniel discuss new releases from PrimeFaces, OpenWebBeans, DeltaSpike, Spring Boot, Polymer, AngularJS, WebAssembly, Play, Lucene, new JSF extensions, and more. They also discuss Microsoft’s open-source strategy and Visual Studio Code. Keep up with the alphabet soup of product names. Check out our technical glossary! UI Tier OmniFaces 2.1 released! New F12 Developer Tools for the New Microsoft Edge Liferay Faces Project News - May 2015 AngularJS + CDI = AngularBeans Web framework: Introducing Juzu version 1.0 and its brand new website Apache Tobago 2.0.8 Release Polymer 1.0 PrimeFaces introduces Rio theme and layout New Releases for PrimeFaces Layouts PrimeUI 2.0 Released Recent Ripple of JSF Extensions ButterFaces Material Prime Generjee JSR 378: Portlet 3.0 Bridge for JavaServer™ Faces 2.2 Specification WebAssembly: A Universal Binary and Text Format for the Web Play 2.4.0 “Damiya” released, adds new DI support and test APIs Persistence Tier [ANNOUNCE] Apache Lucene 5.2.0 released Services (Middleware & Microservices) Tier Oracle Developer Cloud Service 15.2.2 Released Spring for Apache Hadoop 2.2 GA released [ANN] End of life for Apache Tomcat 6.0.x [ANNOUNCEMENT] HttpComponents Client 4.5 GA Released [ANNOUNCE] Apache OpenWebBeans 1.6.0 [ANNOUNCE] Release of Apache DeltaSpike 1.4.0 Apache Allura 1.3.0 released [ANNOUNCE] Apache Flume 1.6.0 released [ANNOUNCE] Apache Calcite 1.3.0 (incubating) released [ANNOUNCE] Commons Email version 1.4 released Misc CRaSH Spring Boot 1.2.4 released Spring Social 1.1.2 Released JBoss Fuse 6.2 is out! Infinispan 7.2.3.Final Discussion JavaEE or Spring? Neither! We Call Out For a Fresh Competitor! Visual Studio Code and Microsoft OSS: will it affect Java Enterprise?
https://code.visualstudio.com/Download MS fork of NodeJS Events Javazone, Oslo - September 9-10, 2015 No Fluff Just Stuff Austin July 10 - 12, 2015 ÜberConf, Denver July 21 - 24, 2015 Raleigh August 21 - 22, 2015 SpringOne 2GX, Washington DC September 7-14, 2015 Atlanta September 18 - 20, 2015
Kito and Daniel cover new releases from HighFaces, GISFaces, Spring, Java SE/ME, WebSphere, Arquillian, Apache, and more. They also discuss Microsoft’s new Spartan browser and Pivotal’s decision to stop sponsoring Groovy and Grails. UI Tier HighFaces 1.0 GISFaces 1.4 Esprima 2.0 Released Project Avatar Update Microsoft’s new Spartan Browser Persistence Tier Spring XD 1.1 GA and 1.0.4 released Spring for Apache Hadoop 2.1 Released Spring Integration Kafka Extension 1.0.GA is available Infinispan 7.1.0 Final Released Services (Middleware & Microservices) Tier OmniSecurity IBM WebSphere Liberty Profile Beta with Tools - February Misc Arquillian Core 1.1.7.Final Released Apache Tika 1.7 Released HttpComponents Client 4.4 GA Released Apache Allura 1.2.0 released Git 2.3 has been released Java SE 8 update 31, SE 7 and SE Embedded Java ME Embedded 8.1 Released Discussion http://www.quora.com/Why-is-Pivotal-ending-the-sponsorship-of-Groovy-and-Grails http://www.businessinsider.com/ibm-layoffs-are-coming-but-nowhere-close-to-100000-2015-1 Events No Fluff Just Stuff Boston, MA Feb 27 - Mar 1 Minneapolis, MN Mar 6 - 8 Madison, WI Mar 13 - 14 San Diego, CA Mar 20 - 21 St. Louis, MO Apr 10 - 11 DevNexus - Atlanta, GA, USA Mar 10-12th, 2015 Philadelphia Emerging Tech - Philadelphia, PA April 7-8, 2015 Clojure West - Portland OR, USA April 20-22, 2015 Devoxx UK - London, UK Jun 17-19th Devoxx Poland - Krakow, Poland Jun 22-24, 2015
Les Cast Codeurs get together in this new year to talk about the few Java news items of 2015, look back on 2014, and philosophize about 2015. Six days after the attack on Charlie Hebdo and what followed, we could not avoid the subject. Recorded January 13, 2015 Episode download LesCastCodeurs-Episode–116.mp3 Happy new year to all, and thanks to this year's sponsors Sfeir and CloudBees, who allowed us to reach number 100. News Maître Eolas whois is Charlie Anonymous is Charlie and pastebin The anti-terrorism judge Marc Trevidic Languages “Scala is the Perl of snobs” Java EE and middleware MVC based on JAX-RS Hibernate OGM released Hibernate Search 5.0 released Booth babes don't work PaaS and mobile Thales and the Caisse des Dép. decide to pull out of Cloudwatt: Cloudwatt, 2m in 2014 revenue for 150m invested, Cloudwatt comes under 100% Orange control Vulnerability in Google App Engine Android Studio 1.0 Why you should stop using CyanogenMod, or the damage done by schoolyard management Infrastructure On June 30, we break the internet The Docker image specification Docker 1.4.0, cleaner than ever, and the DockerCon EU videos online Iliad launches a cloud service on ARM Beyond the code Connected objects and our privacy Podcast l’économie en questions GitHub only uses diffable formats How GitHub uses GitHub (Pages) for its docs A good product manager… TravisCI, From Open (Unlimited) to Minimum Vacation Policy The year 2014, the year 2015 Retrospective Big data (Apache Hadoop, Apache Spark) JavaScript in hyperinflation (AngularJS 2) ReactiveX JavaPosse bows out Lambda and functional programming Java 8 is released Security (truecrypt, shellshock, heartbleed, gotofail, Sony, …) Security in DevOps tools Larry Ellison The place of minorities in tech Apple and the decline in software quality Report to the government
developers in France Microservices PaaS - things are calming down 2015 predictions Big Data matures The container and microservice hype continues The orchestration mess of microservices and containers Asynchronous APIs Security Java modularity: mwahahahaha Nothing in mobile The platform war (Microsoft Azure, Google Services, Amazon WS) Tools of the episode Marketing for tech startups 2400 DOS games playable in the browser The hex color clock BotBot.me, a service to archive and access IRC chat logs in real time Conferences CFP ApacheCon Devoxx France, April 8 to 10 in Paris - CFP closes January 17. DevopsDays Paris, April 14 to 15 in Paris MixIt, April 16 to 17 in Lyon Crowdcasting Contact us Contact us via twitter http://twitter.com/lescastcodeurs on the Google group http://groups.google.com/group/lescastcodeurs or on the website http://lescastcodeurs.com/ Flattr us (donations) at http://lescastcodeurs.com/ Want to know more about sponsoring? sponsors@lescastcodeurs.com
In this episode, we talk with Sam Bessalah about the “new” profession of data scientist. We also explore the Apache Hadoop universe and the Apache Mesos universe. These places are full of projects with strange names, and this interview helps you find your way around this mythology. Recorded December 16, 2014 Episode download LesCastCodeurs-Episode–115.mp3 Interview Your life, your work @samklr His presentations, more here and there Data scientist What is it?! Is it new? Yet we have always had data in our information systems?! The sexiest job of the 21st century? Drew Conway’s Data Science Venn diagram Processing data, the platforms MapR, Hadoop, … What is it? Is it new? Where does it come from? How does it work? What is it for? Does it integrate with everything? And our legacy data sources (my good old mainframe and its EBCDIC)? Where did my EAI, ETL, and other B2C/B2B integration tools go? EAI ETL EBCDIC BI (Business Intelligence) Hadoop MapReduce Doug Cutting Apache Lucene - full-text search engine Apache Hadoop - platform for distributed, scalable processing HDFS - distributed file system Apache Hive - data warehouse on top of Hadoop offering SQL-like queries Teradata Impala - analytic database (“real time”) SQL queries etc Apache Tez - directed-acyclic-graph of tasks Apache Shark replaced by Spark SQL Apache Spark - Spark has an advanced DAG execution engine that supports cyclic data flow and in-memory computing Apache Storm - scalable, distributed processing of data streams Data Flow Machine Learning - learning from data Graph Lab And what about infrastructure in all this? From our good old servers filling machine rooms to the cloud (IaaS, PaaS), by way of virtualization and containers (LXC, Docker, …) …. Resources galore are nice, but how do you manage them?
YARN Apache Mesos Apache Mesos Getting started with Mesos Tutorials Data Center OS from Mesosphere Sam's presentation at Devoxx on Mesos Mesos and Docker containers Cluster Management and Containerization by Benjamin Hindman Continuous integration with Mesos by eBay Docker Docker Starting a Spark cluster with Docker Spark shell in Docker Docker and Kubernetes in Apache Hadoop YARN Hadoop cluster on Docker Docker, Kubernetes and Mesos cgroups LXC Docker vs LXC Marathon Chronos Chronos code Aurora Kubernetes Kubernetes workshop Oscar Boykin Scalding Scala + BigData presentation and another one Apache Ambari How do I get started? How do you become a data scientist? (training, reference books, sources of information, …) Mesosphere Andrew Ng's Machine Learning course Introduction to data science on Coursera Kaggle MLlib Mahout R Scikit-learn (Python) Machine Learning for Hackers (book) Scala TypeSafe Activator iPython NoteBooks Other references iPython NoteBooks Temporary online notebooks - starts a docker container on rackspace for free (for you) Some notebooks Parallel Machine Learning with scikit-learn and IPython View notebooks online without downloading them Spark / Scala notebooks for web based spark development http://zeppelin-project.org/ Spark and Scala with an ipython notebook Contact us Contact us via twitter http://twitter.com/lescastcodeurs on the Google group http://groups.google.com/group/lescastcodeurs or on the website http://lescastcodeurs.com/ Flattr us (donations) at http://lescastcodeurs.com/ Want to know more about sponsoring? sponsors@lescastcodeurs.com
Kito, Ian, and Daniel cover new releases of AngularJS, PrimeFaces, MyFaces, Bootstrap, Hadoop, Spring Roo, Tomcat, Arquillian, Spring Framework, Spring Integration, Akka, Solr, Lucene, and more. They also discuss the forking of Node.js, microservices vs app servers, and a recent blog post about why you shouldn’t use JSF. UI Tier RichFaces 4.5.1.Final released MyFaces Core 2.2.6 released PrimeFaces Elite triple released AngularJS 1.3.0 Aria support Bootstrap 3.3.1 released Persistence Tier Apache Hadoop 2.5.0 is released Spring Roo 1.3.0 introduces JDK 8 support Spring for Apache Hadoop 2.0.3 Released Testing Arquillian OSGi 1.1.1.Final released Arquillian Cube Extension 1.0.0.Alpha1 Misc Spring Integration Java DSL 1.0 GA Released Spring Security OAuth 2.0.4.RELEASE Available Now Spring Framework 4.1.2 & 4.0.8 & 3.2.12 released Akka 2.3.7 Maintenance Release Apache Tomcat 6.0.43 released Release of Apache DeltaSpike 1.1.0 Apache Camel 2.13.3 Released Apache Solr 4.10.2 released Apache Lucene 4.10.2 released Node.js forked Discussion Is Middleware done for in favor of Microservices? Why You Should Avoid JSF Ian’s slides on integrating JSF with front-end frameworks Events No Fluff Just Stuff Jfokus - Stockholm, Sweden Feb 2-4th, 2015 DevNexus - Atlanta, GA, USA (call for papers is open; Kito and Daniel will both be speaking) Mar 10-12th, 2015
Business Solutions for IT Managers: Whiteklay enhances performance of its BioDek distribution system and speeds up genome next-generation sequencing (NGS) with Intel Xeon processor E5 v2 family and CDH, a distribution of Apache Hadoop from Cloudera.
Business Solutions for IT Managers: Intel helps Next Media Ltd. develop a big data article recommendation engine based on CDH, a distribution of Apache Hadoop from Cloudera, to bring more relevant content to its readers.
Apache Solr real-time live index updates at scale with Apache Hadoop
Beyond MapReduce and Apache Hadoop 2.X with Bikas Saha and Arun Murthy
IT Best Practices: Episode 72 – Intel recognizes the value a big data platform can bring to Business Intelligence. Organizations like Intel are rich in data, but that value can’t be realized until there’s a way to collect, sort, and analyze that data to extract meaningful business intelligence from it. And no one knows more […]
Unsupported Operation 79: IntelliJ IDEA 12.1.1 available; JavaZ - new functional patterns library for Java - looks interesting, but UGLY; Lambda Ladies - Recently started group to promote functional programming to women in tech; Sonatype’s gateway to Central upgraded to Nexus 2.4 - what version is your nexus?; JMS2, Bean Validation 1.1, JBatch, JSON-P go final; Resteasy 3.0-beta-4 and 2.3.6.Final Released; Redline-RPM - Native Java RPM generation - no need for native rpm-tools install; https://github.com/stephenc/non-maven-jar-maven-plugin; cucumber-testng-factory 1.0.1 released. Kotlin: funKTionale 0.1.5 is ready. Scala: Atomic Scala print book now available. Clojure: ClojureWerkz Money 1.2.0 - wrapper library for Joda Money; Running and debugging Clojure with IntelliJ IDEA; lein-thriftc - Apache Thrift plugin for Leiningen. Groovy: 2.1.3 available. Apache: HttpClient 4.2.4 released; Maven Compiler 3.1 released; Maven Surefire 2.14.1; Maven Shared Utils 0.4; Wink 1.3.0 - Apache Wink is a simple yet solid framework for building RESTful Web services. It is comprised of a Server module and a Client module for developing and consuming RESTful Web services; Apache PDF Box 1.8.1; Apache Wookie 0.14 - Apache Wookie is a Java server application that allows you to upload and deploy widgets for your applications; widgets can not only include all the usual kinds of mini-applications, badges, and gadgets, but also fully-collaborative applications such as chats, quizzes, and games. Wookie is based on the W3C Widgets specification, but widgets can also be included that use extended APIs such as Google Wave Gadgets and OpenSocial; Apache CouchDB 1.3.0; Apache Struts 1 end of life - going to the Attic; Apache cTAKES becomes a top level project: cTAKES (clinical Text Analysis and Knowledge Extraction System) is an Open Source natural language processing system for information extraction from electronic medical record clinical free-text.
Widely used in production by numerous organisations across the healthcare sector, cTAKES was started in 2006 by a team of physicians, computer scientists and software engineers at Mayo Clinic, and was submitted to the Apache Incubator in June 2012; Pig 0.11.1; Apache Bloodhound 0.5.2 is a tool to track progress and defects in software products. Sits on Trac; Apache Accumulo 1.4.3 - the Apache Accumulo sorted, distributed key/value store is a robust, scalable, high performance data storage system that features cell-based access control and customizable server-side processing. It is based on Google's BigTable design and is built on top of Apache Hadoop, Zookeeper, and Thrift; Apache Syncope 1.0.7 is an Open Source system for managing digital identities in enterprise environments, implemented in JEE technology; Apache Commons-FileUpload 1.3 - bug fixes, enhancements, drops pre 1.5 support; Apache Rave 0.20.2 is a new web and social mashup engine. It provides an out-of-the-box, as well as extendible, lightweight Java platform to host, serve and manage OpenSocial, W3C and other web widgets.
Hortonworks HDP1, Apache Hadoop 2.0, NextGen MapReduce (YARN), HDFS Federation and the future of Hadoop with Arun C. Murthy
How about an insight engine application that runs in a browser for domain experts to explore data at web scale? That's called Big Sheets. In this episode, Stephen Watt, a software architect and Emerging Technologies Hadoop Lead at IBM, and Dan Gisolfi, an IBM Software Group Strategy Architect talk about the data deluge, Apache Hadoop, and Big Sheets. Also see Stephen's dW article, Deriving New Business Insights With Big Data.