During our launch last week we discussed the amazing benefits of XtremIO’s In-Memory Metadata architecture. Some have seen fit to FUD this approach as risky – what happens to valuable metadata if a controller (or power) fails and memory contents are lost? Believe it or not, we did think about these things when designing XtremIO.
So let us clear the air – in-memory metadata is a run-time capability of the array that significantly boosts performance. Metadata is not exclusively kept in memory. It is also journaled, protected, and hardened to SSD and can tolerate any failure event in the array. We couldn’t cover every detail of this during a one-hour launch event, so here’s what we didn’t have time to say last week.
To set the foundation, let’s briefly review the concept of metadata. In the context of storage systems, metadata is simply useful internal information managed by the array…
A guest post by Tamir Segal, Senior PM
Snapshots are a technology for creating copies of volumes. XtremIO has developed a snapshot implementation that creates space-efficient snapshots, which are managed as standard volumes in the cluster and have the same performance capabilities and data services as production volumes. In this post I give a general explanation of legacy snapshot implementations and explain what makes an XtremIO snapshot different.
Legacy snapshot implementations were based on a technology called Copy-On-First-Write. The idea of Copy-On-First-Write is that once a snapshot is created, a new storage pool is defined in the system; every write to the production volume triggers a data movement operation to the snapshot pool:
· A new write is received by the system on LBA X
· The system reads the original data on LBA X from the production volume
· The system…
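As a rough illustration of the Copy-On-First-Write idea described above, here is a minimal Python sketch. The excerpt is cut off after the read step, so the remaining steps shown here (saving the original data to the snapshot pool, then applying the new write) follow the commonly described behavior of copy-on-first-write and are an assumption, not a description of any specific array. The class and method names (`CofwVolume`, `create_snapshot`, `read_snapshot`) are hypothetical.

```python
class CofwVolume:
    """Toy production volume with one copy-on-first-write snapshot (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)   # LBA -> data on the production volume
        self.snapshot_pool = None    # LBA -> original data preserved for the snapshot
        self.copy_operations = 0     # number of extra data movements triggered by writes

    def create_snapshot(self):
        # Creating the snapshot only defines an (empty) snapshot pool.
        self.snapshot_pool = {}

    def write(self, lba, data):
        # On the FIRST write to an LBA after the snapshot was taken, read the
        # original data and copy it to the snapshot pool (assumed behavior).
        if self.snapshot_pool is not None and lba not in self.snapshot_pool:
            original = self.blocks.get(lba)
            self.snapshot_pool[lba] = original
            self.copy_operations += 1
        # Then apply the new write to the production volume.
        self.blocks[lba] = data

    def read_snapshot(self, lba):
        if self.snapshot_pool is None:
            raise RuntimeError("no snapshot has been created")
        # Blocks changed since the snapshot are served from the pool;
        # unchanged blocks are still read from the production volume.
        if lba in self.snapshot_pool:
            return self.snapshot_pool[lba]
        return self.blocks.get(lba)


vol = CofwVolume({0: "a", 1: "b"})
vol.create_snapshot()
vol.write(0, "A")    # first write to LBA 0: triggers a copy to the snapshot pool
vol.write(0, "AA")   # second write to LBA 0: no additional copy
```

The point of the sketch is the extra read and copy on the first overwrite of each block, which is the data movement the list above refers to.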
By Dr. Alon Grubshtein, Principal Data Scientist — EMC IT
This is a great time to be a data scientist – a bit like being rock stars, with fans always trying to catch some private time with us. While there is no clear definition of what a data scientist is (see related blog or view diagram of DS skillset), our take on this role is quite simple:
- Work with stakeholders to elevate high-impact, business-related questions
- Find the means to answer these questions
This blog aggregates our collective experiences as members of EMC’s Corporate IT Data-Science-as-a-Service (DSaaS) team. Our team has been active since 2012, providing Data Science (DS) services to different business units as part of EMC IT’s transformation to an agile and innovative IT-as-a-Service model.
Although we aimed for a technical blog, we thought that the first post should provide a broader context to the…
Because of the many discussions and the confusion around partitioning, disk alignment and its sibling issue, ASM disk management, here is an explanation of how to use UDEV. As an extra, I present a tool that manages some of this for you.
The questions could be summarized as follows:
- When do we have issues with disk alignment and why?
- What methods are available to set alignment correctly and to verify?
- Should we use ASMlib or are there alternatives? If so, which ones and how to manage those?
I’ve written two blog posts on the matter of alignment, so I am not going to repeat the details here. The only thing you need to remember is that classic “MS-DOS” disk partitioning, by default, starts the first partition on the disk at the wrong offset (wrong in terms of optimal performance). The old partitioning scheme was invented when physical…
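To make the alignment problem concrete, here is a small Python sketch that checks whether a partition's starting offset falls on a given boundary. The numbers are assumptions for illustration and do not come from the excerpt: 512-byte sectors, a classic MS-DOS default start at sector 63, and 4 KiB and 1 MiB as example boundaries. The function names are hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector (assumed; some disks report 4096)


def partition_offset_bytes(start_sector, sector_size=SECTOR_SIZE):
    """Byte offset of a partition from the start of the disk."""
    return start_sector * sector_size


def is_aligned(start_sector, boundary_bytes=1024 * 1024, sector_size=SECTOR_SIZE):
    """True if the partition starts on a multiple of boundary_bytes."""
    return partition_offset_bytes(start_sector, sector_size) % boundary_bytes == 0


# Classic MS-DOS style default (assumed here): first partition at sector 63.
legacy_start = 63
# Common modern default: first partition at sector 2048, i.e. 1 MiB.
modern_start = 2048

legacy_offset = partition_offset_bytes(legacy_start)        # 32256 bytes
legacy_ok_4k = is_aligned(legacy_start, boundary_bytes=4096)
modern_ok_1m = is_aligned(modern_start)
```

With these assumed values, the legacy start lands at byte 32256, which is not a multiple of 4096, while a start at sector 2048 lands exactly on 1 MiB. The right boundary for a given array depends on its internal block or stripe size, so the boundary should be treated as a parameter.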
First off, my apologies for delaying the last part of this four-part blog for so long. I have been building a fully automated application platform-as-a-service product for EMC IT to allow us to deploy entire infrastructure stacks in minutes – all fully wired, protected and monitored, but that topic is for another blog.
In my last post, Best Practices For Virtualizing Your Oracle Database With VMware, the best practices were all about the virtual machine itself. This post will focus on VMware’s virtual storage layer, called a datastore. A datastore is storage mapped to the physical ESX servers, onto which a VM’s LUNs, or disks, are provisioned. This is a critical component of any virtual database deployment as it is where the database files reside. It is also a silent killer of performance because there are…
Oracle Big Data Lite Virtual Machine provides an integrated environment to help you get started with the Oracle Big Data platform. Many Oracle Big Data platform components have been installed and configured – allowing you to begin using the system right away.
Oracle has provided a single VM that contains everything you need to get familiar with its Big Data platform. Now that it has been updated to Oracle 12c components, it’s exactly what I have been waiting for, including videos and tutorials on integration with Hadoop and NoSQL. If you have a dual-core, 8GB RAM laptop with 50GB free disk space, all you then need is Oracle VM VirtualBox to run the VM and 7-zip to extract the contents of the first BigDataLite-3.0.001 file. This will create a BigDataLite-3.0.ova VirtualBox appliance file; double-click it and VirtualBox will open. Once you start the VM, log in as oracle/welcome1.
Oracle has provided a Deployment Guide with these and further details.
I am fascinated to see how Oracle Corporation is responding to the challenge of the new Big Data and NoSQL vendors; for me, that journey started here.