Saturday, February 21, 2015

Working with Compression on HDFS

Copy and uncompress a file to HDFS without unzipping it on the local filesystem

If your file is several gigabytes in size, this command helps avoid out-of-space errors, because the file never has to be unzipped on the local filesystem.

The put command in Hadoop supports reading input from stdin. To read from stdin, use '-' as the source file.

Compressed filename: compressed.tar.gz
  
gunzip -c  compressed.tar.gz | hadoop fs -put - /user/files/uncompressed_data

Disadvantage: The only drawback of this approach is that the data lands in HDFS as a single file, even if the local compressed file contains more than one file. For a .tar.gz, that single file is the uncompressed tar stream rather than the individual member files.
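
One possible way around the single-file drawback is to stream each member of the archive into its own HDFS file. The sketch below is only an illustration, not something from the original post: the function name put_tar_members is hypothetical, and it assumes GNU tar, member names without newlines, and that flattening directory paths to base names is acceptable.

```shell
#!/usr/bin/env bash
# Sketch: stream each regular file inside a .tar.gz into its own HDFS file,
# without extracting the archive to local disk.
# Assumptions: GNU tar, member names without newlines, and a hypothetical
# helper name (put_tar_members) that is not part of Hadoop.

put_tar_members() {
  local archive="$1" dest_dir="$2" member
  # List the members, skipping directory entries (names ending in "/").
  tar -tzf "$archive" | grep -v '/$' | while IFS= read -r member; do
    # -O writes the member's contents to stdout; "-" makes put read stdin.
    tar -xzOf "$archive" "$member" < /dev/null |
      hadoop fs -put - "$dest_dir/$(basename "$member")"
  done
}

# Example call (paths taken from the post above):
# put_tar_members compressed.tar.gz /user/files/uncompressed_data
```

Note that this rereads the archive once per member, so it trades extra decompression time for local disk space, and two members with the same base name in different directories would collide in HDFS.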

I used this today while working on a real-world problem.

Thanks to the blogger below:
http://bigdatanoob.blogspot.com/2011/07/copy-and-uncompress-file-to-hdfs.html

Sunday, February 15, 2015

Big Data solution for My Retail client

We wanted to deliver the best marketing campaigns, coupons, and offers down to the individual customer. Direct customer relationships are a privilege, but they also require processing massive amounts of data, and providing the best winning price to the customer at the point of sale is a complex task. Our client uses Hadoop to process this data.
Merchants enter offers into the portals for a particular week or month on the basis of regular price, promotional price, and clearance price, for groups of products and groups of markets across the globe. These offers then flow into the Hadoop system for further processing. Based on the promotional offer transactions, we explode the data in Hadoop, generating promotional offers that help retailers make informed decisions about pricing, promotions, and assortment management.
We explode these offers using Pig and Hive for all products and stores across the globe. In product explosion, we generate promotion data for every product that belongs to a product group. In market explosion, we explode this data for the respective stores across the globe. The product group and market category details used during both explosions come from the internal data warehouse. During these transformations we check store authorization for the product's category at each store and apply many complex business rules.
Based on this data, Hadoop calculates the optimal promotional retail price for each product. Billions of promotional records are generated and processed using Pig and Hive transformations. We do chaining and collision within the generated promotional data, and then, out of the billions of records, we categorize the winning prices (winning data) and the losing data.
Hadoop provides a near-complete ecosystem in which we run batch and ETL-type processing, run analytics, store data, and quickly process billions of records. We store data in HDFS and process it with Hive and Pig, which enables analytics on this promotional dataset. We run multiple transformation jobs and deliver information to multiple systems. Every day we send the winning data (after pricing optimization) to our points of sale (retail stores), where, when a customer purchases a product, our associate provides the promotional price applicable at that time. We also send the winning and losing data to our data warehouse so that our merchants can generate BI reports from it.

Saturday, December 6, 2014

What’s New in IBM Cognos 10.2 (Now we can use it for Big Data too)


Cognos Workspace (Formerly Business Insight) It seems a little strange to say we’re glad that IBM Cognos 10.2 delivers fewer “insights”, but the renaming of Business Insight and Business Insight Advanced to Cognos Workspace and Cognos Workspace Advanced, respectively, is a welcome change. Hopefully this will help to ease some of the confusion that has arisen in recent years among the myriad “insight”-labeled components within the IBM Business Analytics solution portfolio. Along with this rebranding come a number of useful new features.
Multi Tabbed Workspaces A major enhancement to Cognos Workspace is the ability to easily create multi-tab workspaces, enabling you to expand your workspace’s visual footprint without the need to scroll. With tabbed workspaces also comes a new control known as an Action Button, which can be programmed to trigger tab changes.
Freeze and Unfreeze Column and Row Headers Excel users rejoice: IBM has added the ability to freeze and unfreeze crosstab column and row headers while scrolling.
Data Visualization Guide As mentioned previously, the theme of applied data visualization theory is a recurring one in Cognos 10.2. The new Visual Recommender in Cognos Workspace will help you to select the appropriate chart type based on the data values in your workspace, along with some insightful rationale for that decision.
Graduated Capabilities Administrators can now assign graduated capabilities to the Cognos Workspace tool. Users can have the option of Authoring, Interacting or just Consuming any Cognos Workspace. This is useful in controlling the available Cognos Workspace feature set for governance purposes, or to even further simplify the experience to drive adoption in a user population of diverse skill sets.
Other Features Cognos Workspace 10.2 also brings support for printing and for use in Google Chrome and Safari.
Cognos Mobile Cognos Mobile was a major focus in the 10.1 (General) and 10.1.1 (Refresh Pack) releases, with outstanding functionality and support across a broad set of devices. In 10.2, Mobile sees a few minor but noteworthy updates.
  • Push Notifications Cognos Mobile now supports push notifications to the iOS status bar in the iPad app. Users can now be notified when a new version of their report is available, further reducing the latency in decision making on the go.
  • Improved Performance of Multi-page Reports Multi-page reports can now be streamed to devices, reducing the loading time. Users no longer have to wait for the entire report to download before viewing.
  • Administrators Can Now Secure Mobile Access A new mobile capability in Cognos Administration allows administrators to govern access to Cognos via mobile devices. This has benefits from a governance perspective for all organizations that deploy Cognos Mobile, especially those that employ a Bring Your Own Device (BYOD) policy.
  • Report Studio Prompt API The new prompt API is a long overdue and welcome update for Professional Authors and Report Studio hackers everywhere. The API is 100% documented and supported, and like any good API can be expected to persist across future product renditions. The API provides a documented method for setting, reading, deleting and validating prompt values using JavaScript. The API is supported in Report Studio, Cognos Viewer, Cognos Workspace and Cognos Workspace Advanced.
  • Excel Improvements Expect improved Excel compatibility of Cognos report outputs, with spreadsheet maximums increased to 16,384 columns by 1,048,576 rows. Cognos 10.2 also brings a brand new Excel output format called Excel 2007 Data, which is perfect for lightweight data transfer to Excel without any report formatting.
  • Dynamic Cubes By far the most exciting new feature in Cognos 10.2 is the addition of a new OLAP technology to the IBM Cognos solution suite which, in conjunction with pre-existing OLAP options, provides developers with the option of the right tool for the job at hand. The Dynamic Cube solution builds upon existing Dynamic Query Mode (DQM) functionality to close a major gap that exists in many vendor OLAP offerings: the organization with a large or mature star or snowflake schema data warehouse that wants to provide an OLAP experience to its users without sacrificing data details or high performance. Here are a few highlights of this new feature:
  • Big Data Support: Hadoop / Hive via JDBC Connector; SQL Server and Analysis Services 2012; Salesforce.com (Native DQM); SAP ECC (R3) (Native DQM); Siebel (Native DQM). The big news here is native support for Big Data and seamless integration with IBM InfoSphere BigInsights (Hadoop). Additionally, native support for a number of ERP vendors has been added to Cognos via Dynamic Query Mode for increased performance and reduced implementation complexity. Prior to 10.2, support for data sources such as Salesforce.com, SAP ECC and Siebel required a connector via Virtual View Manager (VVM). We should also note that VVM 10.2 will be the final release of this solution, and while it will continue to be supported, developers should take note that it is now deprecated in favor of DQM.
  • 100% 64 Bit The remaining 32-bit components of IBM Cognos 10 get an update for full 64-bit compatibility: BI Gateway, Metric Studio and Data Manager.
  • Multi-Tenant Support Cognos 10.2 now supports native multi-tenancy. While this feature may not have mass appeal, it will be incredibly valuable to certain IBM Cognos OEM partners as well as organizations that have deployed Cognos in a federated manner and are looking to more easily segment different business units or sub-organizations within their shared IBM Cognos platform architecture.
  • Content Archiving Last but not least, Cognos 10.2 now natively supports fully integrated content archiving to a file system. This feature will be coveted by organizations that have content retention and audit requirements and are looking to keep their Cognos content store free from large report outputs to maximize their environment’s performance. This new feature has much of the same archival functionality that IBM FileNet customers have enjoyed since the 10.1.1 release.

Cognos Insight

Finally, I got some time to learn and catch up on the latest IBM BI tools.
I have been working with the Cognos BI tool for 7+ years.
Cognos Insight will be the next big thing in the market because IBM has beautifully integrated TM1 and Cognos BI workspace capabilities in Cognos Insight version 10.2.

Cognos Insight 10.2 Cognos Insight, released in early 2012, is the Personal Analytics solution that has taken the marketplace and users’ desktops by storm, garnering a #1 ranking among all Self Service Business Intelligence Platforms in The Forrester Wave (Q2 2012). Cognos 10.2 continues to improve ease of use and integration with IBM Cognos Enterprise to encourage collaboration and enterprise distribution while avoiding the data silos that competitive solutions can create. What’s the use of personal insight if the entire organization cannot stand to benefit?
Tree Maps Customers and Partners alike have clamored for improved data visualizations and IBM has responded in kind. The addition of Tree Maps to Cognos Insight pays homage to the applied data visualization theories of luminaries such as Edward Tufte and Stephen Few, and it is a theme that is quite prevalent in Cognos 10.2. Unfortunately, we’ll have to wait for Tree Maps in the enterprise studios, as this is a Cognos Insight-only feature for now.
Smart Metadata A new data discovery engine in Cognos Insight will automatically detect dimensional levels in hierarchies and even differentiate between numeric attributes and measures. This means users will spend even less time importing data, further shortening the learning curve of this already intuitive platform.
Drill-able, Drag and Drop Charting and Top/Bottom Filtering Users can now drill up and down directly on charts, just as is possible in Cognos Enterprise, and they can also drag and drop dimensions and measures directly onto charts as opposed to having to retain a crosstab in prior versions. Also included is rapid top/bottom filtering in Crosstabs that is always a simple right click away.
Package Import Prior to this latest release, users could import data into Cognos Insight from existing IBM Cognos Reports. IBM takes the story of unified metadata one step further, allowing users to source data directly from the same ad-hoc reporting packages that are published in the Cognos Enterprise environment. For IT and business alike this means master data management and data governance from the enterprise to the desktop: unified business logic and terminology at every step of the way.
Time Rollups Cognos Insight now has out-of-box functionality to build custom time dimensions or roll-ups. Populate entire years regardless of how sparse your data may be, or customize it to match your organization’s fiscal calendar.
High Fidelity Publish Customers who also have IBM Cognos TM1 Enterprise can publish their Cognos Insight dashboards to Cognos 10.2 Enterprise as full-fledged, tabbed Cognos Workspaces. This means that Cognos Insight users can share their analyses with other Cognos Enterprise users who do not have Cognos Insight capabilities. This is huge from both a customer licensing and a usability perspective, as prior releases required that any user who wanted to view an Insight dashboard have Cognos Insight capabilities, which came at the higher Advanced Business Author licensing level. To view a Cognos Workspace, a user need only be at the popular Enhanced Consumer license tier. What’s a Cognos Workspace you ask?

Go through some of the training videos at the link below.

http://ibmtvdemo.edgesuite.net/software/analytics/cognos/videos/HTVs/ci102/index.html

It has what-if analysis, budgeting and planning, and visualization capabilities.
It automates the development of TM1 applications, which reduces the time for end-to-end deployment of small or medium applications.
It is based entirely on a self-service model, which gives end users a lot of capability to get things done without involving IT.
I am sure that in upcoming versions IBM will extend this tool with many more capabilities for data visualization on Big Data or HDFS (let's see).

I think we can replace Transformer cubes with Cognos Insight by publishing the CDD workspace file to IBM Cognos BI Connection as a workspace.


I will be starting a POC soon.
Let me know what other folks think about Cognos Insight.


Monday, August 29, 2011

Features list for Mobile Reporting

A few days back, I got an opportunity to prepare a feature list for a mobile reporting solution tool.
We all know the day is not far off when everyone will move to an iPad/touch-pad or smartphone to access real-time BI applications.
Mobile users need business intelligence data wherever they are to make information-driven decisions, whether they are salespeople who want product details or executives who need a bird's-eye view of the business. Mobile BI apps should deliver information through interactive reports, dashboards, metrics, and more on users’ mobile devices.
There are a lot of tools available in the market.
Knowing these features certainly helped me understand the tools better.
I can now use this feature list to compare the various options available in the market and give informed advice to my clients.
My personal preference is Cognos Go! Mobile, because version 10 adds a lot of functionality to meet the needs of mobile users.

Below is the link to the feature list:


https://docs.google.com/spreadsheet/ccc?key=0AiobrRxK74dwdGpzQjYyMW92b052WnFJZ2xjbndveGc&hl=en_US

Wednesday, March 30, 2011

Displaying the List Column Header in Vertical Mode

Sometimes we need to display a column header vertically.
You can rotate the text 90 degrees by using the TextFlow & Justification property for the column titles and setting the Writing Mode to "Top to Bottom, Left to Right". To find this property, click the column title in the report and then scroll down in the Properties pane for the title on the left-hand side.
