Monthly Archives: September 2013
- Ordinal data can be represented by numbers that show their order as well as text values that have more meaning. For example, high, medium, and low can be stored as 3, 2, and 1, respectively. Their natural order is preserved, and the more descriptive text is present, too. The variable with text labels can be analyzed either as categorical or continuous.
- Easy data entry. The numeric associations can easily be modified and become a shortcut in data entry. For example, when typing in the data, I can type in 1 and that value will automatically show Low from my text labels.
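The two points above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the software stores text labels internally; the names `TEXT_LABELS`, `display`, and `enter` are hypothetical.

```python
# Hypothetical text-label table: numeric code -> descriptive label.
TEXT_LABELS = {1: "Low", 2: "Medium", 3: "High"}

def display(code):
    """Return the label shown for a stored numeric code (or the number itself)."""
    return TEXT_LABELS.get(code, str(code))

def enter(value):
    """Accept either a numeric code or a label and return the stored code."""
    if value in TEXT_LABELS:
        return value
    by_label = {label.lower(): code for code, label in TEXT_LABELS.items()}
    return by_label[str(value).lower()]

# Data typed in as a mix of numeric shortcuts and labels.
stored = [enter(v) for v in [1, "High", 2, "low"]]
print(stored)                                 # [1, 3, 2, 1]
print([display(c) for c in stored])           # ['Low', 'High', 'Medium', 'Low']

# Sorting the codes keeps the natural order of the labels.
print([display(c) for c in sorted(stored)])   # ['Low', 'Low', 'Medium', 'High']
```

Because the stored values are numbers, sorting follows Low, Medium, High rather than alphabetical order.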
When a variable is selected for analysis as a continuous variable (in basic descriptive statistics, for example) and that variable has text labels, the following warning dialog box is displayed. This does not mean the analysis can’t proceed. It simply brings to your attention the fact that the analysis you are about to perform may be suspect. Consider the previous example where 1 to 3 represent low to high. We can compute a mean, standard deviation, etc., on this data because the numbers 1 to 3 are used in the mathematical formulas. This warning dialog box prompts you to examine whether this analysis makes sense with the data you have selected. If so, select the option to continue. If not, you can further explore the variables containing text labels with the Scan Spreadsheet option.
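To make the warning concrete, here is a small Python sketch using made-up responses coded 1 to 3. The data and variable names are assumptions for illustration only. The formulas run happily on the codes; whether the result is meaningful is the question the dialog asks you to consider.

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical ordinal responses stored as codes: 1 = Low, 2 = Medium, 3 = High.
labels = {1: "Low", 2: "Medium", 3: "High"}
codes = [1, 1, 2, 3, 3, 3]

# Treated as continuous: the arithmetic uses the codes, not the labels.
print(round(mean(codes), 2))   # 2.17
print(stdev(codes))            # a valid number, but is "2.17" a meaningful level?

# Treated as categorical: count how often each level occurs.
counts = Counter(labels[c] for c in codes)
print(counts["Low"], counts["Medium"], counts["High"])   # 2 1 3
```

A mean of 2.17 assumes the step from Low to Medium is the same size as the step from Medium to High, which may or may not hold for your data.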
In numeric data, suppose you inadvertently typed in some text, or on import, perhaps the row of variable names was incorrectly read in as the first row of data. Now, a text label and number combination is used in this column. Deleting the offending case is only one step in fixing this issue. The text label, although not used, is still there. This will cause the warning dialog above to be displayed in analysis. The software does not know that the text label was a mistake. Using the Text Label Editor, the unwanted text label can be removed.
Another potential problem stemming from accidental text labels is unexpected text popping up in your numeric data. Because of a data entry or import error, a number is assigned a text label. Now, when that number naturally occurs in the data, the number is hidden by the unwanted text label. The root cause of the issue and the fix are the same, but the symptoms are different.
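The behavior described in the last two paragraphs can be mimicked with a short Python sketch. It is a toy model under stated assumptions, not the software's actual storage: the `labels` dictionary, the `shown` helper, and the label text "Weight" are all hypothetical.

```python
# Hypothetical label table for a numeric column. The entry for 101 was
# created by accident when a header row was imported as the first case.
labels = {101: "Weight"}
column = [101, 72.5, 101, 68.0]   # first case is the bad row; the third is a real 101

def shown(value):
    """Return what the spreadsheet would display for a stored value."""
    return labels.get(value, value)

# Step 1: deleting the offending case does not delete the label,
# so the genuine 101 is still hidden behind the text.
del column[0]
print([shown(v) for v in column])   # [72.5, 'Weight', 68.0]

# Step 2: removing the unwanted label restores the numeric display.
labels.pop(101)
print([shown(v) for v in column])   # [72.5, 101, 68.0]
```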
One final possible symptom is unexpected values in graphs and analyses. This plot shows what happens when numeric data, on a scale of 0 to 1, are plotted in a histogram, but one case has an unexpected text label. The numeric value associated with the text in this graph is the default 101. The data look skewed, as a very extreme outlier is present. This is simply a data entry error that is masked by text labels. Using the Text Label Editor, you can further explore this error.
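Outside the software, the same kind of error can be caught with a simple range check. The Python sketch below uses invented data on a 0-to-1 scale with one case carrying the value 101; the expected range and the variable names are assumptions for this example.

```python
# Hypothetical data on a 0-to-1 scale. One case was entered as text and
# carries the numeric value 101 behind its label.
values = [0.12, 0.48, 0.33, 101, 0.91, 0.27]

LOW, HIGH = 0.0, 1.0   # expected range, assumed for this example

# Flag every case (position, value) that falls outside the expected range.
suspect = [(i, v) for i, v in enumerate(values) if not LOW <= v <= HIGH]
print(suspect)                    # [(3, 101)]

# Compare the maximum with and without the flagged case.
clean = [v for v in values if LOW <= v <= HIGH]
print(max(values), max(clean))    # 101 0.91
```

A single flagged case like this is what stretches the histogram axis and makes the rest of the data look skewed.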
by Win Noren on Wednesday, August 21, 2013 1:42 PM
I read an interesting article the other day: The Cloud Begins With Coal: Big Data, Big Networks, Big Infrastructure, and Big Power – an overview of the electricity used by the global digital ecosystem. I must admit that until reading this (lengthy) article I never gave much thought to the electrical consumption of our ever-expanding digital world.
According to this paper, the Information-Communications-Technologies (ICT) ecosystem consumes almost 10% of world electricity generation and 50% more energy than global aviation. Even more surprising to me was that streaming an hour of video content weekly to my smartphone or tablet will consume more electricity in a year than is consumed by two new refrigerators! Beyond noticing that my cell phone needs to be charged, I never thought about the electrical cost to deliver content to my devices.
Soon, hourly Internet traffic will exceed the annual Internet traffic of 2000. This digital traffic is distributed by an electricity-consuming infrastructure. According to the Digital Power Group, coal is the world’s largest source of electricity, currently supplying 40% of global electricity, which is why they state that “the digital universe and Cloud begins with coal.”
The paper goes on to remind us that “digital bits are electrons…[and that] astronomical quantities of data eventually add up to real power in the real world.” In fact, according to Greenpeace¹, “If the Cloud were a country, it would have the fifth largest electricity demand in the world,” coming after only the US, China, Russia, and Japan but before India.
Obviously, data centers are large consumers of electricity, and for many the cost of buying computer servers is less than the cumulative cost of the electricity consumed by those servers over their four-year life span. Facebook opened a data center in 2012 in North Carolina, where electric rates are 10-30% below the national average, and Facebook projects that it will save $100 million in operating costs because of the lower electrical costs. It is also projected that this Facebook facility will use one million tons of coal over the next decade. Similarly, a huge data center under construction in China advertises cheap power, not cheap labor, as its competitive advantage.
So, of course, the obvious question is: how will the ever-growing need for power be met? I was aware that our growing world has a growing demand for electricity, but I had never considered the role that moving bits of digital data play in this… Hey, did you see that funny comedy video by Michael Jr?
1 Greenpeace International, How Clean is your Cloud, April 2012