InfoSphere DataStage integrates data on demand across many systems through a high-performance parallel framework, extended metadata management and enterprise connectivity.

 


InfoSphere DataStage provides massively scalable capabilities that enable companies to solve large-scale business problems, integrating with all types of data - 'big' or 'small' - to help you quickly adapt to your changing business environment.

  • Supports the collection, integration and transformation of large volumes of data, with data structures ranging from simple to highly complex.
  • Provides support for big data and Hadoop, enabling you to directly access big data on a distributed file system and address the most challenging data volumes in your business. In addition to supporting InfoSphere BigInsights, connectivity options exist for Cassandra, HDFS, Hive, HBase, MongoDB, and other NoSQL databases.
  • Offers additional Balanced Optimization for Hadoop capabilities to push the processing to the data for greater efficiency.
  • Supports big-data governance, including features such as impact analysis and data lineage.
  • Offers workload management capabilities to optimize hardware utilization and prioritize mission-critical tasks.
  • Allows you to harness the power of business rules management through direct integration with IBM Operational Decision Management (formerly ILOG JRules).
  • Provides features such as an operations console and an interactive debugger for parallel jobs that help customers work smarter, enhance productivity and accelerate problem resolution.
  • Supports real-time data integration and connectivity between any data source and any application.
  • Enables developers to maximize speed, flexibility and effectiveness in building, deploying, updating and managing their data integration infrastructure.
  • Languages supported: French, Korean, Chinese (Simplified and Traditional), Spanish, Portuguese-Brazilian, German, Japanese, English, Italian

Next steps

Contact us today to learn how IBM InfoSphere DataStage can help your company to maximize performance - you can complete the form or call us at 877-454-4898, and we would be delighted to consult with you and make specific recommendations.

IBM InfoSphere DataStage and IBM InfoSphere DataStage for Linux on System z provide these unique capabilities:

  • The powerful ETL solution supports the collection, integration and transformation of large volumes of data, with data structures ranging from simple to highly complex. IBM InfoSphere DataStage manages data arriving in real time as well as data received on a periodic or scheduled basis.
  • The scalable platform enables companies to solve large-scale business problems through high-performance processing of massive data volumes. By leveraging the parallel processing capabilities of multiprocessor hardware platforms, IBM InfoSphere DataStage Enterprise Edition can scale to satisfy the demands of ever-growing data volumes, stringent real-time requirements, and ever-shrinking batch windows.
  • Unique support for big data makes it easier and more efficient to explore and integrate big data and to reach the next level of analysis quickly. InfoSphere DataStage supports InfoSphere BigInsights; offers connectivity options for Cassandra, HDFS, Hive, HBase, MongoDB, and other NoSQL databases; provides Balanced Optimization for Hadoop (to push processing to the data); integrates with IBM InfoSphere Streams (to provide direct data flow integration that gathers and passes information to real-time analytical processes); adds big data job sequencing; and includes features that support big data governance, such as impact analysis and data lineage on any big data integration points.
  • Workload management capabilities enable policy-driven control of system resources and prioritization of different classes of workloads. Customers can use new workload management capabilities to optimize hardware utilization and prioritize mission-critical tasks, throttle job activities where resources exceed specified thresholds, and assess, assign and reassign the priority of jobs as new jobs are submitted into the queue.
  • Harnesses the power of business rules management to adapt more quickly to changing business requirements. InfoSphere DataStage can integrate directly with IBM Operational Decision Management (formerly ILOG JRules), helping organizations bridge the gap between business people and IT by implementing decision logic using IBM Operational Decision Management within InfoSphere Information Server.
  • Comprehensive source and target support covers a virtually unlimited number of heterogeneous data sources and targets in a single job, including text files; complex data structures in XML; ERP systems such as SAP and PeopleSoft; almost any database (including partitioned databases); web services; and business intelligence tools like SAS.
  • Real-time data integration support captures messages from Message Oriented Middleware (MOM) queues using JMS or WebSphere MQ adapters to seamlessly combine data into conforming operational and historical analysis perspectives. IBM InfoSphere Information Services Director provides a service-oriented architecture (SOA) for publishing data integration logic as shared services that can be reused across the enterprise. These services can simultaneously support the high-speed, high-reliability requirements of transactional processing and the high-volume bulk data requirements of batch processing.
  • Advanced maintenance and development enables developers to maximize speed, flexibility and effectiveness in building, deploying, updating and managing their data integration infrastructure. Full data integration reduces the development and maintenance cycle for data integration projects by simplifying administration and maximizing development resources.
  • Complete connectivity between any data source and any application ensures that the most relevant, complete and accurate data is integrated and used by the most popular enterprise application software brands, including SAP, Siebel, Oracle, and PeopleSoft.
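
As a rough illustration of the policy-driven prioritization and throttling described in the workload management item above, the following Python sketch queues jobs by priority and starts them only while a concurrency limit is not exceeded. It is not the DataStage workload manager and does not use any DataStage API; the class name WorkloadQueue, the max_running threshold, and the job names are all hypothetical.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedJob:
    priority: int                      # lower number = more urgent (assumption)
    seq: int                           # submission order, used to break ties
    name: str = field(compare=False)   # job name is not part of the ordering


class WorkloadQueue:
    """Hypothetical priority queue with a simple concurrency throttle."""

    def __init__(self, max_running: int):
        self.max_running = max_running  # threshold above which jobs must wait
        self.running = set()
        self._heap = []
        self._counter = itertools.count()

    def submit(self, name: str, priority: int) -> None:
        """Add a job to the queue; it does not start until dispatch() runs."""
        heapq.heappush(self._heap, QueuedJob(priority, next(self._counter), name))

    def dispatch(self) -> list:
        """Start queued jobs in priority order while capacity remains."""
        started = []
        while self._heap and len(self.running) < self.max_running:
            job = heapq.heappop(self._heap)
            self.running.add(job.name)
            started.append(job.name)
        return started

    def finish(self, name: str) -> None:
        """Mark a running job as complete, freeing one slot."""
        self.running.discard(name)


queue = WorkloadQueue(max_running=2)
queue.submit("nightly_report", priority=5)
queue.submit("billing_load", priority=1)    # mission-critical in this example
queue.submit("adhoc_extract", priority=9)

first_wave = queue.dispatch()     # the two most urgent jobs start
queue.finish("billing_load")
second_wave = queue.dispatch()    # the waiting job starts once a slot frees up
```

In this sketch, a job submitted later with a more urgent priority overtakes jobs already waiting, which loosely mirrors reassessing priorities as new jobs enter the queue.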

Flexibility to perform information integration directly on the mainframe. InfoSphere DataStage for Linux on System z provides:

  • Ability to leverage existing mainframe resources in order to maximize the value of your IT investments
  • Scalability, security, manageability and reliability of the mainframe
  • Ability to add mainframe information integration workload without added z/OS operational costs

InfoSphere DataStage Pack for Data Masking

IBM InfoSphere DataStage Pack for Data Masking provides in-line capabilities for masking sensitive data in extract, transform, load (ETL) jobs for information integration.

Mask sensitive data in ETL processes to help reduce compliance costs and security risks

  • Data privacy is a global issue. Many data privacy requirements result from specific legislation that requires companies to maintain a certain level of privacy and security regarding their customers’ data. Because data is a critical asset at the core of every business, it must be protected in every phase of its lifecycle within an organization. Data privacy is a key element of protecting enterprise data in non-production environments.
  • For integrated, in-line data masking in ETL processes, the InfoSphere DataStage Pack for Data Masking leverages InfoSphere Optim capabilities to help ensure realistic test data for application testing while adhering to customer privacy requirements.
  • In-line masking capabilities designed to ensure sensitive data used in InfoSphere DataStage ETL processes is not exposed to development or test teams
  • Pre-configured masking policies within InfoSphere DataStage for common business data types such as e-mail address, customer name, address, social security number, credit card number
  • Leverages the InfoSphere Optim proven technology for data masking capabilities
  • Maintains semantic and referential integrity of the masked data within InfoSphere DataStage for realistic, reliable, application testing
  • Supports masking for complex file types including mainframe files and EBCDIC files
  • Leverages the DataStage user interface and parallel processing framework for consistency, re-usability, and scalability
  • Leverages the common Information Server platform for data lineage to increase visibility into the overall masking flow within the ETL process
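
To make the idea of masking that maintains referential integrity more concrete, here is a minimal Python sketch. It assumes a deterministic, keyed approach: the same input always yields the same masked value, so a masked key still joins across tables, and the original format (digit positions and separators) is kept. This is not the InfoSphere Optim algorithm or a DataStage masking policy; the function name mask_digits, the demo key, and the sample records are hypothetical.

```python
import hashlib
import hmac
import string

# Hypothetical demo key; a real deployment would manage keys securely.
SECRET_KEY = b"demo-key-not-for-production"


def mask_digits(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace every ASCII digit with a keyed pseudorandom digit.

    Non-digit characters (such as dashes) are kept, so the masked value
    has the same layout as the original. The mapping is deterministic for
    a given key, so equal inputs produce equal outputs.
    """
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    position = 0
    for ch in value:
        if ch in string.digits:
            # Derive a digit from the digest; slightly biased, fine for a sketch.
            masked.append(str(digest[position % len(digest)] % 10))
            position += 1
        else:
            masked.append(ch)
    return "".join(masked)


# Two hypothetical tables that share a sensitive key column.
customers = [{"ssn": "123-45-6789", "name": "Customer A"}]
orders = [{"ssn": "123-45-6789", "total": 42}]

masked_customers = [{**row, "ssn": mask_digits(row["ssn"])} for row in customers]
masked_orders = [{**row, "ssn": mask_digits(row["ssn"])} for row in orders]

# The masked key still matches across both tables.
keys_still_join = masked_customers[0]["ssn"] == masked_orders[0]["ssn"]
```

Because the mapping depends only on the input value and the key, rows masked in separate jobs remain joinable, which is the property test teams need for realistic application testing.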

InfoSphere DataStage MVS Edition

InfoSphere DataStage MVS Edition provides native data integration capabilities for the mainframe.

It supports integration of legacy mainframe data with other enterprise data - consolidating, collecting, and centralizing information from various systems and mainframes.

  • Generates COBOL applications and the corresponding custom JCL scripts for processing all mainframe flat files, DB2®, IMS™, VSAM and Teradata
  • Accelerates building a data warehouse on the mainframe by providing access to mainframe data sources
  • Provides a powerful graphical user interface to design ETL processes that run natively on the mainframe
  • Generates COBOL programs and JCL that are easily extended
  • Offers access to mainframe flat files, IBM IMS, VSAM, IBM DB2, and Teradata files
  • Supports real-time data integration and end-to-end metadata management
