Definitive, solutions-focused, and forward-thinking IT professional with a successful career across the software development lifecycle, from initial feasibility analysis, conceptual design, and documentation through construction, implementation, and quality review. Equipped with hands-on experience building, coding, and directing successful information technology initiatives.
Expert in Hadoop-based technologies with a focus on data integration and analysis. Driving new discoveries through Just-In-Time analytics, leveraging co-located datasets at scale to provide insight and pattern detection that was not previously possible. Building data pipelines that produce results in minutes or hours across petabytes of data. Building and discovering new ways to co-locate, integrate, and leverage disparate datasets using the Lambda architecture.
Hortonworks, Inc. - Sr. Technical Director - A-Team (Engineering)
May 2013 - Present
Hadoop Professional Experience
- Architect / Design Hadoop Infrastructure
- 50+ Hadoop cluster buildouts, from small (10 nodes) to large (hundreds of nodes)
- Hadoop Operations and Mentoring
- Multi-tenant Hadoop Environments
- Development Best Practices
- Operational Best Practices
- Data Ingestion with Flume, Sqoop, Storm, and HDFS CLI
- Partitioning strategies and data formats in Hive/Tez
- MapReduce programming
- HiveQL and Pig
- Cluster conversions from Cloudera to HDP (clusters from 1 to 7 PB)
- Developed cluster conversion process from MapR to HDP (7 PB)
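The partitioning and data-format work above can be illustrated with a minimal HiveQL sketch. This is a hypothetical example, not a client schema; the table, column, and property values are illustrative of the common date-partitioned ORC pattern on HDP.

```sql
-- Hypothetical: a date-partitioned, ORC-backed Hive table with ZLIB
-- compression, a common layout for Hive-on-Tez workloads on HDP.
CREATE TABLE call_records (
  customer_id BIGINT,
  event_ts    TIMESTAMP,
  payload     STRING
)
PARTITIONED BY (event_date STRING)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB");

-- Writing into a daily partition lets the optimizer prune to only the
-- dates a query touches, which matters at multi-petabyte scale.
INSERT OVERWRITE TABLE call_records PARTITION (event_date='2015-01-01')
SELECT customer_id, event_ts, payload
FROM raw_call_records
WHERE to_date(event_ts) = '2015-01-01';
```

Choosing the partition key from data sampling (rather than assuming an even distribution) is what guards against the severe skew problems described later in this section.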
Telco Company Principal Architect / Developer
- Designed and implemented a 500+ node cluster
- Evangelized, trained, and mentored infrastructure and operations groups to manage these environments
- Designed, built, and implemented integration systems that digest 200-400 billion records a day
- Built a custom interface to look up customer records across 15 trillion records in less than 30 seconds
- Designed a highly efficient ingestion process that organizes datasets in a manner providing flexibility and scale
- Developed custom reader for ORC files for application access
- Developed a pre-Tez caching mechanism for ORC header information using EhCache
- Built and tuned analytics solutions across petabytes of data on Hadoop platforms
- Expert in Just-In-Time analytics, built to leverage "Schema on Read" datasets.
- Wrote MapReduce programs to digest and interpret machine log data
- Designed consolidation routines to efficiently process non-contiguous data
- Experienced MapReduce developer and an expert in higher-level MR languages such as Hive and Pig
- Mentored and worked with customers to integrate systems across the enterprise; sources included RDBMS back ends, messaging systems, static file delivery, device logs, web interfaces, custom solutions, and vendor and prepackaged solutions
- Trained and mentored customers in best practices for Big Data security and governance
- Built an automation process to handle data consolidation, validation, and ingestion
- Built a custom JEE REST application distributed across 300 nodes to support low-latency query capabilities on a pre-YARN architecture, yielding 10-second response times across 5 trillion records
Securities Regulatory Company
- Reviewed customer partitioning strategies across several years' worth of exchange data
- Built a custom partitioning strategy based on data sampling to account for severe data skew
- Debugged and troubleshot customer windowing functions on HDP with predicate pushdown
- Provided guidance on the performance implications of using S3 to store short-term data used in low-latency analytics
- Planned and designed an HDP dynamic cluster on AWS using a combination of ephemeral storage (SSDs), EBS, and S3, with additional tiered storage capabilities in HDP 2.2
Internet / Entertainment Company
- Led a successful migration from MapR to HDP
- Migrated 4 PB of data, including an HBase conversion from 0.94 to 0.98
- Built processes to manage the migration across 20+ client applications
- Created processes to migrate text-file-formatted tables to newer, more efficient ORC-based tables
- Architected and implemented an NFS solution for HDP
- Mentored operational and infrastructure teams around HDP cluster management
- Led a team of Resident Architects to help the customer improve development efforts with Hadoop technologies
Market Exchange Company
- Code-reviewed Chef scripts used for HDP base deployment and HA management
Data Aggregation Cloud Company Systems Architect
- Performed first Hortonworks upgrade from HDP 1.1 to 1.3
- Oversaw the upgrade of 3 clusters (400 nodes) from HDP 1.1 to HDP 1.3
- Implemented Ambari via an Ambari takeover to manage the cluster
- Mentored the customer on ORC benefits and usage
- Assisted with Customer migration from HDP 1.3 to HDP 2.1
Electronics Retail Company
- Executive and Technical briefing/design regarding network topologies, rack layouts and machine configurations
- Worked with various engineering groups (Infrastructure, Security, Operations) to establish best practices for the implementation and ongoing support for HDP
- Implemented their first on-premises Hadoop cluster on top of OpenStack
Rental Car Conglomerate
- Consulted on Storm performance issues
- Resolved the issues and achieved a 600% performance increase in Storm topologies
- Uncovered a leak in the HBase connection factory causing failures in HBase 0.96
- Implemented an alternative approach and worked with Hortonworks engineering to patch the leak
Major Home Improvement Retail Company Principal Architect
- Assisted with conversion from Cloudera to HDP
- Mentored executive sponsors and developers on the Hadoop ecosystem and automation frameworks (Oozie)
- Built proof-of-concept ingestion patterns with Oozie, Pig, Hive, and HCatalog
- Reviewed and mentored development teams transitioning to Hive 0.11
- Guided performance testing against HDP 2.x for planned migration from HDP 1.x
Developer / Architect
- Reviewed infrastructure and cluster installation
- Discovered cluster infrastructure issues that had degraded the cluster's performance by 50%
- Executive briefings on Hadoop technologies
- Rebuilt a fixed-pattern matching program, originally written in Hive (UDFs), as a MapReduce program, reducing runtime by 70% and increasing cluster utilization by 300%
Architect / Operations / Developer
- Installed first cluster for customer
- Mentored team on Cluster operations
- Migrated legacy Security Intrusion process to Hadoop, reduced time for investigation from 48 hours to 6 minutes, while providing a greater range of access to data
- Improved retention capabilities
- Created a custom loader in Pig for MS XML event logs (https://github.com/dstreev/PigStandardParserUtils)
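A custom loader like the one above plugs into Pig as a LoadFunc. The sketch below shows a hypothetical invocation; the jar path and the `com.example.XmlEventLoader` class name are illustrative placeholders, not the actual API of the linked project.

```pig
-- Register the jar containing the custom loader (path is illustrative).
REGISTER /opt/pig/libs/pig-parser-utils.jar;

-- Load Windows XML event logs via a custom LoadFunc
-- (class name is hypothetical; see the linked repo for the real one).
events = LOAD '/data/raw/winevents'
         USING com.example.XmlEventLoader()
         AS (event_id:chararray, level:chararray, message:chararray);

-- Downstream, the events behave like any other Pig relation.
errors = FILTER events BY level == 'Error';
STORE errors INTO '/data/curated/winevent_errors';
```

Wrapping the XML parsing in a loader keeps the event schema in one place, so analysis scripts stay free of parsing logic.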
Housing Rental Company Principal Security Architect
- Executive Briefing on Hadoop Security and Governance
Government HealthCare Company
- Led the HDP upgrade effort from HDP 1.3 to HDP 2.1 across 3 clusters
- Architected consolidation of various Hadoop clusters into a central Data Lake
- Reviewed current customer ingestion and processing practices
- Built a POC with the customer's data showing a 600% improvement in data processing
- Integrated customer data on HDP with SAS
Online Singles Company Principal Architect
- Led the Hadoop cluster rollout for the customer
- Mentored teams on usage of Hadoop Technologies like Sqoop, Hive, Pig, Flume
- Helped customer establish best practices using Hadoop based technologies
- Provided guidance on converting datasets from MSSQL to Hive and on effective partitioning strategies
- Tuned HiveQL for the customer, providing access to historical datasets that was not previously available in MSSQL
- Debugged and uncovered data integrity issues where assumed business knowledge conflicted with current datasets, leading to more effective analytics
Other Professional Experience
Equifax Global Enterprise Application Architect
Took full responsibility for establishing and implementing corporate software and development standards. Provided effective leadership in launching Hadoop (Big Data) technologies at Equifax. Evaluated various technologies and their potential uses at Equifax. Developed and submitted proposals to IT senior leadership, informing them of new technologies and defining the strategic value of such technologies throughout the enterprise.
- Successfully launched and presented technology to IT senior leadership, helping them to understand the technology and its potential uses from a global perspective.
- Conceptualized detailed designs for corporate Hadoop backbone to eventually be used across the organization.
- Supported business unit architects on common use patterns and applicability.
- Coordinated the first effort in vetting the technology, which resolved long-running real-world business issues within two weeks.
- Developed and implemented analytic processes in cooperation with business analysts; converted the concept into a strategy that solved a complex problem.
- Managed and configured several Hadoop implementations (including Kerberos).
- Worked in partnership with international and domestic business unit development shops to build a corporate-wide software development life cycle (SDLC) standard.
- Conducted research on SDLC reference model and presented findings to Senior Leadership for acceptance.
- Coordinated with purchasing, operations, and infrastructure in implementing corporate-wide SDLC environment to promote the solution across the organization.
- Developed corporate standard on address standardization in support of business unit architects.
- Created and updated address standardization APIs for general consumption that could be used to abstract various vendor implementations.
McKesson Provider Technologies
Integration Architect - Consultant
Strategically formulated and delivered a solution for integrating McKesson's corporate portfolio management system with the development project management tools used by various business units. Managed the design and implementation of an "Integration Hub" for synchronizing data between CA Clarity and Rally, with potential later integrations with JIRA and TFS taken into account. Streamlined CA Clarity access through Clarity's XOG Web Service interface. Built a framework specifically for accessing CA Clarity, then utilized that framework as a component of the Integration Hub. Rendered direct assistance to management in realizing the benefits of an end-to-end development platform.
- Helped provide multiple access options through Rally, including a WSDL (based on Axis 1.4) and a RESTful API implemented in JSON and/or XML.
- Designed a framework to access Rally through its RESTful JSON API.
- Took part in implementing The Hub's "eventing" model using ActiveMQ as an embedded messaging system, via a Pub/Sub (topic) to queue (point-to-point) implementation using "Virtual Destinations".
Java Systems Architect, Macy's Systems and Technology
Played a major role as the subject matter expert for Loyalty platforms. Collaborated with Macy's stakeholders on their new rewards platform initiative. Formed and implemented the core logical "entity" models for the system and transitioned those logical models into physical models to build out TIBCO backend components. Served as a consultant and collaborated with other groups, including third-party vendors, corporate architects, senior management, and business unit executives, on design decisions and implementation details for the product.
- Designed and presented several key business processes through Activity Diagrams that captured 'Dynamic' system behaviors.
- Formulated Web Service interfaces (Abstract WSDLs and XSDs) to feed transactional information into the platform.
- Utilized TIBCO's BusinessEvents rules engine with Orchestration through TIBCO's BusinessWorks.
Java Systems Architect
Intercontinental Hotels Group
Served in a lead role as the key contact for a migration project that converted parts of an MVS mainframe system to distributed Java technologies, acting as Service-Oriented Architecture Architect / Technical Lead. Coordinated the migration of 680 million reservations containing sensitive personal data from historical sources dating back two years. Spearheaded a team in developing and providing ongoing real-time services that persisted reservations made on the mainframe to Oracle 11g. Collaborated with data center and operations personnel in creating the new hardware and technology stack. Oversaw the SOA-based architecture on the new technology stack.
- Coordinated with database administrators to determine partitioning strategies that scaled to support two years' worth of historical data and 12 months of future data collected by the system.
- Rendered technical assistance to requirements within network configurations, load balancing, Oracle Clustering, and MQSeries Clustering.
- Developed and maintained Web Services through WSDL to Java with Apache CXF.
- Provided solution architecture based on Spring 2.5, OSGI and ServiceMix ESB.
- Installed and configured Maven build scripts for multi-module projects.
- Gained recognition as technical champion and mentor for "Canonical" enterprise model by designing methods saving thousands of lines of code while enabling development to react promptly to changing business rules and situations.
Distributed Systems Engineering
Secured working relationship with the EFT (Card Services) business unit by proactively participating in the design and configuration of a DR solution.
- Successfully supported Fiserv Enterprise Technology in closing the gap with development, particularly on operational readiness and awareness.
- Designed and implemented a system to oversee numerous WebLogic domains used by business units and managed by the Enterprise Technology Division, which allowed WebLogic administrators to plan and configure their domains offline.
- Assumed operational engineering responsibilities, including Linux administration (RHEL and SUSE) and HP Server Automation training and provisioning.
Senior Director – Principal J2EE Architect
Analyzed, designed, and proposed the core credit card rewards program for the next-generation, post-mainframe rewards platform adopted by TSYS (Total System Services, Inc., TSS) to support several customers, such as The Home Depot, Capital One, Washington Mutual, Toronto Dominion, and RBCC. Oversaw the execution of a brand-new line of business for ESC Loyalty. Acted as the visionary behind the overall platform and implementation. Cooperated with business development to ensure the platform's capabilities matched those of current and prospective clients. Reported directly to client executives regarding the platform's capabilities and future directions.
- Took part in carrying out envisioning, development, and deployment of first customer, The Home Depot, on the platform in 6 months.
- Generated a $25M business line by subsequently acquiring customers on the platform, including Capital One (No Hassle Rewards), Washington Mutual, Toronto Dominion, RBCC, CB&T, and FiServ.
- Served as architect and principal developer, overseeing all development efforts throughout the next generation rewards engine.
- Facilitated training and supervised a team of 12 developers, embracing new technologies, such as Hibernate (JPA), JavaServer Faces, MVC, JEE, and XML.
- Adhered to companywide best practices and policies for new technologies, code management, and traceability to ensure projects were SAS70 compliant.
- Set the bar for source control, overseeing five code branches that supported deployments in Development, QA, UAT, and Production for multiple client implementations.
- Achieved a successful completion of all projects built from source control through automated ANT build scripts.
Director / Architect / Principal Developer
Led and supervised a team building an Enterprise Management Console application for an IDS sensor. Delivered a solution built on a J2EE infrastructure with components such as JMS for asynchronous messaging, Web Start for dynamic client deployment, PostgreSQL for data persistence, and Java Swing for the client GUI. Worked closely with customers and the sales team to deliver an operative and scalable platform.
- Effectively deployed a synchronous request/response mechanism through use of the messaging (JMS) subsystem.
- Acted as project head in a start-up environment; formed and cultivated a team to work on the development of various architecture platforms.
HDP 1.3, 2.0, 2.1, 2.2
- Pig 0.11, 0.12, 0.14
- Hive 0.11, 0.12, 0.13, 0.14
- HDFS / YARN
CDH 3.x, 4.x
- Pig 0.10
- Hive 0.11
Other Technical Capabilities
Java EE 5, 6, 7
Oracle, MySQL, PostgreSQL
Red Hat 5,6
Web Services, SOAP, REST
Education and Certifications:
- Bachelor's degree, Aviation Management / Flight Technology – Florida Tech
- Hortonworks: Certified Hadoop Administrator 1.0 and 2.0
- Cloudera Certified Developer for Apache Hadoop (CCDH)
- Certified MongoDB Administrator / Developer
- TIBCO BusinessEvents
- HP Server Automation