Ian Abramson
Toronto, Canada


AUTHOR BIO: Ian Abramson is the past president of the IOUG, the Oracle technology user group, and an Oracle data warehousing expert with more than 20 years of experience. Based in Toronto, Canada, he is the director of the Enterprise Data Group for Thoughtcorp, a company that focuses on business solution architecture, solution development, system integration, data warehousing, and business intelligence. Ian has written numerous books on both Oracle and data warehousing and is a frequent presenter at Oracle and industry conferences and seminars. He is the co-author of Oracle Database 11g: A Beginner's Guide. Ian is an Oracle ACE focused on BI and data warehousing.

SERVICES: Oracle DB, data warehousing, business intelligence, strategy, architecture


Oracle Database 11g: A Beginner's Guide
Published on: 2008-12-18
Copyright: 2009

Oracle Database 10g: A Beginner's Guide
Published on: 2004-03-31
Copyright: 2004


Bricks and Mortar Transform to Data and the Internet

Today we look at the transformation that retail is undergoing. We experience it every day of our modern lives. Unfortunately, there are those who do not make it. Target Canada announced last week that it plans to close its Canadian operations. It had significantly under-performed in Canada and as a result is closing all of its stores. Sony will be doing the same, and others, such as Mexx, already have.

Are these stores closing because of the economy? Because of a miscalculation of what the Canadian consumer desired? Maybe. I look at it a different way. I believe that the way we consume today is different, and as a result the bricks-and-mortar companies of today need to transform as well. You can look to the Internet to see what one needs to provide: an experience that bridges online and offline. Consider that Target opened over 100 new stores in Canada, yet it lacked an online store. It is not rocket science. Today I begin my shopping on the web and then decide whether I will venture to a store for a hands-on experience, where I can compare the product to others in the store, or simply purchase it online. When a company does not even have an eCommerce site, I am already looking elsewhere.

Our almost unlimited access to information means that the shopping experience must address the customer experience both in person and online. In a world where data is king and information is power, companies, and retailers specifically, need to provide both experiences to truly embrace a customer. The information you can then gather about your customers and their habits allows companies to move beyond the declining traditional business model. All retailers in today's digital society must provide the service and use the information to drive business and drive value, or the bell will ring for them just as it has for Target in Canada.

Data Takes its Rightful Place in 2015

The year comes to a close, and as a data professional I look back at it and see a world which is embracing and rejecting data at the same time. People who are not in the IT field are asking me about data, which is a sign that things must be getting interesting. They ask about Big Data, they ask about security and privacy of their data, and they talk about how data can be used for a greater good, but they are always concerned with what it all means. All of these topics will be important as we embark on the year of Data.

Data is taking a prominent role in how our everyday lives are influenced. I see it from the systems side, where we consider the data that each application generates and captures as key to the value that the application will provide. We live in a world where the datafication of everything is the goal. The Internet of Everything is no longer a pipe dream. We need to begin to look at how we can make this happen, and 2015 is when many of these aspirations will be realized.

Data requires trust. The concern that many of us have is that we understand the value information can provide. Data, and the information and knowledge it generates, means that we can improve the present and change the future. However, we need to get better at how we treat and protect data. Security of information needs to be our first priority. If we can protect data from malicious uses, we can also give people confidence that their data is being used to help and not harm. The recent breaches at Sony and Home Depot show us that we must be more vigilant about security. So for 2015 we need to resolve to put in place very good data security and to encrypt all the data that we can. Even something as simple as changing passwords can make a difference.

For data to achieve its full potential, everyone must do their part and secure their own local data as well as the data they place in the Cloud. With so many forms of data, we must ensure we address all data access and all data sharing. Consider what happens if you use something like Google Docs: you are abdicating responsibility for your data security to Google. If Google makes a security mistake, it will apologize and try to do better. When others get their hands on your data, the story is personal and very different; it can mean the difference between winning and losing in business. So you must question the security of the Cloud and how well you can manage it. Do not depend on others for your data security. Tough security may present some challenges, but in the long run it will be best for you and for your data. This security will enable data sharing and truly maximize the power of data as we relate datasets together in new and valuable ways.
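As a small illustration of the "do your part" point, here is a minimal Python sketch of one basic habit: never storing passwords in plain text, but deriving a salted hash instead. The function names are my own invention for the example, and this is only an illustration of the principle, not a complete security solution.

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes = None, iterations: int = 200_000):
    """Derive a storable hash from a password using PBKDF2-HMAC-SHA256.

    A random salt means two users with the same password get different
    hashes; a high iteration count slows down brute-force guessing.
    """
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes,
                    iterations: int = 200_000) -> bool:
    """Re-derive the hash and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

Only the salt and digest are stored; the password itself never needs to be written anywhere.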

So 2015 will bring us new ways to look at data. Big Data as a technical specialization has arrived, and I see more people who understand that it is part of the greater data solution. Big Data does not stand on its own but rather is now a critical part of your overall data strategy. As a result we can bring together information which in the past could not be combined. This is a time when information should drive actions. In the past we looked at data to confirm what we wanted to know; we needed data to validate that our work was valuable and impactful. The new paradigm of analytics allows us to consider how to provide better service and solutions based on statistically accurate predictions created to respond to direct questions. The ability to ask the right questions and get the information in a timely manner is the goal I try to embed within data solutions, and it is what will allow data to make the impact it must.

And the following infographic, which I recently found, shows how quickly we were generating data in 2014. Consider that 2015 is only going to get crazier for data, so it's time for data to take its rightful place in business.


Data in the Time of Change
Back at it... I have been working on various other social forums for blogging and writing, but I am back at my original blog. Time to get data to work.

Data is what I continue to focus on today: all aspects of data and how it should be used for the greater good.

Here is something I wrote recently for the IOUG and thought you might find interesting. I am also finishing up some ideas for 2015 which should be coming soon. I resolve for 2015 to be a better blogger.

Data in the Time of Change

By Ian Abramson, EPAM Systems Inc.

We look at changes in technology, process changes, professional changes and others which are impacting us as Oracle and data professionals. I find it intriguing that in my everyday job at EPAM I am one of those who helps companies change how they treat and use their data. Big Data and advanced analytics are all about how organizations and individuals are adapting to the new data paradigm, and I am lucky enough to be part of it. We are empowering the individual to make choices based on the unprecedented access to information which we can now provide.

The change has been in many ways gradual and in other ways very sudden. The technology in the Big Data ecosystem changes faster than Superman in a phone booth. We see releases from Apache and the various Hadoop distributions, along with new projects surrounding the technology, becoming available each and every week. In August, Apache made Spark generally available. Spark allows you to interact with Hadoop data up to 100 times faster. It does this by changing how it gets data from the cluster, reducing overhead and doing as much work in memory as possible. Why is this important? Because existing Hadoop installations will want better performance than the existing tools provide. So now you have a choice: do you begin using Spark, or, if you are using Cloudera, should you use Impala? You know that you need to change, but the question is which change will bring the most benefit. Big Data requires one to embrace change.

Besides the technology evolution we are seeing more desire for businesses to change. Big Data has opened up new opportunities within business. We now are providing them with access to information which previously had been unavailable or difficult to access and analyze. So we need to be sure we control this access because with great power comes great responsibility. Embrace change but make changes to improve the experience or the organization not simply to change.

The following infographic illustrates how data and analytics are changing, and how we can make a difference if we all evolve and change:

[Infographic: the history of predictive analytics]

The Everything of Data

When do things change from hype to reality? Or when is it just more hype? I have previously discussed how Big Data is changing everything; it has moved from simple hype to complicated reality, producing very interesting results. The latest hype is the concept, or the reality, of interconnecting all of our data via the Internet of Things (IoT). This concept should improve how things work, and not just for businesses: I mean machines, distribution systems and people. I have to admit that I truly love the idea of being able to share information in a way that is meaningful and impactful, sharing insights and recommendations and ultimately improving our lives. We live in a complex world; now imagine your data becoming your life optimizer.

All the data morsels we generate in our everyday lives, and all of the sensors in our machines, produce lots of information, but generally these components never talk to each other. Connections between individuals, public sources, businesses and our infrastructure grid could work together seamlessly for a better individual experience and improved global performance. We could have simple things like our home dryers waiting until the power grid has lower demand before starting to dry our clothes. Or a doctor in a remote part of Canada could search a medical database of all medical symptoms, their treatments and outcomes, to tailor a healthcare program for a sick patient. The interactions and benefits we can ultimately achieve are limitless.
The question of hype is a truly valid one for IoT. We need to consider how we can share, and share in a way that benefits people, governments and businesses. We have the technology today to begin the sharing, but in reality we also live in a society where sharing information in this way scares people. They ask why their confidential information should be shared. I say, why not? To quote Mr. Spock: "Logic clearly dictates that the needs of the many outweigh the needs of the few." We need to consider the upside and manage the risk of sharing.

Big Data has brought with it new challenges and new problems. The security of Big Data in Hadoop clusters is evolving on a daily basis. We have options to encrypt, to restrict and to limit who can see what, but the security of Hadoop is still quite basic and requires additional components like Kerberos to help provide some protection. In many ways Big Data is simply available: if someone wants to get at it, they will have an easier time than trying to break into a well-secured Oracle database. Then again, Oracle does have a 25-year head start on Hadoop.

So is the Internet of Things hype? I don’t think so, I think it’s more of a dream state. One where data can live together and provide value well beyond its original intention. So it’s one dream that I hope to help bring to reality.

Designing a Future-Proof Data Solution

The challenge we face today when designing solutions is how to avoid the pitfalls of constant design change. How can we reduce the impact of change on our data designs? Is it even possible?

The design of a data warehouse has been well discussed and debated over the years. The battle between Ralph Kimball and Bill Inmon is legendary. The choice of an Information Factory versus a Dimensional approach continues to be one which all new data warehouses need to consider. In this discussion the choice is really immaterial: whichever design approach you choose, you will still need to consider how the design will be developed. Can we build the design incrementally? Can we minimize the impact on the overall project and minimize regression testing? This is the constant challenge we face when developing data solutions at EPAM, especially when using Agile practices to drive our project success. The key is to design and develop once and to evolve the design as you go, but there are some key considerations to make when designing in order to optimize the design and minimize the refactoring which may be required by design changes.

The first item you must consider is using design patterns for your data warehouse modelling. This approach says that all objects will be built from templates which address most of the needs within your design. This means that similar tables in the design will follow predefined patterns. At the most basic level, we predefine what a dimension and a fact will look like. Each will include a surrogate key and the various attributes it requires. In addition, each will include control fields that allow us to manage how and when data is processed. For more complex facts and dimensions we also provide templates which support all of the different slowly changing dimension types, as well as quickly changing facts; both of these more advanced design methods give us the ability to manage the data effectively and consistently. For relationships, we define intermediate tables: we build "bridge" tables which provide a reliable way to relate facts and dimensions, improve performance, extend query capabilities in the future, and form a key part of allowing the model to work across multiple subject areas.
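To make the template idea concrete, here is a generic sketch of the pattern in DDL, run through SQLite from Python so it is self-contained. The table and column names are invented for the example (they are not actual project templates): a dimension with a surrogate key and control fields, a fact with a load-date control field, and a bridge table relating the two.

```python
import sqlite3

# Illustrative pattern-based DDL: every dimension gets a surrogate key
# plus control fields, and a "bridge" table relates facts to dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_sk   INTEGER PRIMARY KEY,  -- surrogate key
    customer_id   TEXT NOT NULL,        -- natural/business key
    customer_name TEXT,
    effective_dt  TEXT NOT NULL,        -- control fields: when this row
    expiry_dt     TEXT,                 -- became / stopped being current
    current_flag  INTEGER DEFAULT 1
);
CREATE TABLE fact_sales (
    sale_sk     INTEGER PRIMARY KEY,    -- surrogate key
    sale_amount REAL NOT NULL,
    load_dt     TEXT NOT NULL           -- control field: batch load date
);
CREATE TABLE bridge_sales_customer (    -- bridge table: relates facts
    sale_sk     INTEGER REFERENCES fact_sales(sale_sk),
    customer_sk INTEGER REFERENCES dim_customer(customer_sk)
);
""")
```

Because every dimension repeats the same surrogate-key and control-field shape, the ETL that loads them can also follow one template rather than being written table by table.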

The second consideration is to design with the future in mind. Here you are faced with the choice of building based only on defined requirements, with little consideration for future ones. In the Agile context, designing what you need when you need it seems like the obvious approach, and the concept of just-in-time design has been discussed and developed over the past few years. However, when we put this into practice, the reality is that you want to define your facts and dimensions as completely as possible the first time you design them, so that you design for future needs in addition to the ones you have at the moment. This will result in additional attributes which might only be used in a much later sprint but are defined now to reduce refactoring. It may also be necessary to define additional dimensions in order to minimize the rework of adding dimensions to fact tables later. The key is to design what you need when you need it, while providing as much forward thinking in your object definitions as early in the process as possible.

The final suggestion I have for future-proofing is to ensure that your data warehouse is designed to support the integration of multiple data sources right from the start, adding the attributes and ETL functionality which support this approach. The data warehouse is really all about providing the business with an integrated and reliable solution; therefore you must design with the goal of integration from the beginning.

Ultimately, the design and development of a data warehouse requires the data architect and data modeler to look to the future. They need to anticipate data requirements and define the data objects and relationships as completely as possible right from the start. You can avoid many of the pitfalls of data warehouse design by designing with the end in mind while allowing the design to evolve based on business needs.

Master of Your Data using the Database
I was recently involved in a project for an organization that needed one thing: a master customer list and a master product list to enable cross-organization analysis. This may seem like a simple task, to create a single customer and product list, but it is not simple.

As JFK said about going to the moon, "We choose to go to the moon in this decade and do the other things. Not because they are easy, but because they are hard." The same is true of customer and product integration and mastering. MDM as a technology and a process is not easy, it is hard, but it provides so much value in the end that it is worth the journey.

The challenge of MDM is focused squarely on creating a technical solution which enables the business to automate the process of matching customers and products into single master lists. It can take significant effort to get to the point where the rules you have defined for matching are meaningful and effective.

The project I was involved in required us to create a solution which was cost-effective and did not include a matching product like DataFlux or Trillium, but was instead based on the database and the ETL tool. Our database of choice was Oracle, which provides some SQL extensions to support matching. We implemented the matching within an ETL tool (Talend), which further extended the capabilities we had in the database. A number of functions were considered, and the following Oracle functionality was used in our cleansing and matching approach:
  1. Regular expressions were used to find patterns and to remove or alter text, enabling standardization of names and addresses
  2. Equi-joins and other join types to match records
  3. The Soundex (or Metaphone) function, in combination with other matches, to enable fuzzy matches
  4. Jaro-Winkler, Levenshtein and other edit-distance functions for fuzzy matching
  5. ETL tool functionality which further extends the base database functionality
All of these functions can help you find the right matches in your database and give you the functionality to build your own MDM solution, one which leverages the investments you have already made in your database and ETL tool without a huge new investment in software.
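To make the fuzzy-matching idea concrete, here is a plain-Python sketch of a Levenshtein edit distance and a simple match rule built on it. In Oracle the roughly equivalent calls would be SOUNDEX and the UTL_MATCH package; the helper names below are my own, invented for the example, and a real matching rule set would combine several such signals.

```python
# Plain-Python sketch of edit-distance fuzzy matching (in Oracle this
# role is played by UTL_MATCH.EDIT_DISTANCE and similar functions).

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b,
    computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def is_fuzzy_match(a: str, b: str, max_edits: int = 2) -> bool:
    """Treat two names as the same entity when, after basic
    standardization (trim, uppercase), they are within max_edits."""
    a, b = a.strip().upper(), b.strip().upper()
    return levenshtein(a, b) <= max_edits
```

For example, "Jon Smith" and "John Smith" are one edit apart and would match, while unrelated names fall well outside the threshold.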

I will be presenting this solution at COLLABORATE 13 in Denver in April, and this entry should help you as you consider an alternative approach to matching, which will be critical to your MDM solution.


Website: Ian's Oracle Community Blog

Copyright © 2013 McGraw-Hill Global Education Holdings, LLC Any use is subject to the Terms of Use and Privacy Notice