Monday, June 27, 2011

From Terence Tao's Buzz

In order to use mathematical modelling to solve a real-world problem, one ideally would like to have three ingredients besides the actual mathematical analysis:

* A good mathematical model. This is a mathematical construct which connects the observable data, the predicted outcome, and various unspecified parameters of the model to each other. In some cases, the model may be probabilistic instead of deterministic (thus the predicted outcome will be given as a random variable rather than as a fixed quantity).

* A good set of observable data.

* Good values for the parameters of the model.

For instance, if one wanted to work out the distance D to a distant galaxy, the model might be Hubble's law v = H D relating the distance to the recessional velocity v, the data might be the recessional velocity v (or, more realistically, a proxy for that velocity, such as the red shift), and the only parameter in this case would be the Hubble constant H. This is a particularly simple situation; of course, in general one would expect a much more complex model, a much larger set of data, and a large number of parameters. (And parameters need not be numerical; a model, for instance, could posit an unknown functional relationship between two observable quantities, in which case the function itself is the unknown parameter.)
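As a tiny sketch in R of the "direct problem" in this toy example (the numbers below are illustrative, not from the post): once the data v and the parameter H are known, one simply solves the model for D.

```r
# Hubble's law v = H * D, so the direct problem is D = v / H.
H <- 70        # Hubble constant in km/s/Mpc (assumed round value)
v <- 14000     # observed recessional velocity in km/s (hypothetical)
D <- v / H     # predicted distance in megaparsecs
D              # 200 Mpc
```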

As mentioned above, in ideal situations one has all three ingredients: a good model, good data, and good parameters. In this case the only remaining difficulty is a direct one, namely to solve the equations of the model with the given data and parameters to obtain the result. This type of situation pervades undergraduate homework exercises in applied mathematics and physics, and also accurately describes many mature areas of engineering (e.g. civil engineering or mechanical engineering) in which the model, data, and parameters are all well understood. One could also classify pure mathematics as being the quintessential example of this type of situation, since the models for mathematical foundations (e.g. the ZFC model for set theory) are incredibly well understood (to the point where we rarely even think of them as models any more), and one primarily works with well-formulated problems with precise hypotheses and data.

However, there are many situations in which one or more ingredients are missing. For instance, one may have a good model and good data, but the parameters of the model are initially unknown. In that case, one needs to first solve some sort of inverse problem to recover the parameters from existing sets of data (and their outcomes), before one can then solve the direct problem. In some cases, there are clever ways to gather and use the data so that various unknown parameters largely cancel themselves out, simplifying the task. (For instance, to test the efficiency of a drug, one can use a double-blind study in order to cancel out the numerous unknown parameters that affect both the control group and the experimental group equally.) Typically, one cannot solve for the parameters exactly, and so one must accept an increased range of error in one's predictions. This type of problem pervades undergraduate homework exercises in statistics, and accurately describes many mature sciences, such as physics, chemistry, materials science, and some of the life sciences.
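A hedged R sketch of such an inverse problem, using simulated data and the Hubble example above (nothing here comes from the post): the unknown parameter H is recovered from (distance, velocity) pairs by least squares before any direct prediction is made.

```r
# Toy inverse problem: recover H from noisy (D, v) data, assuming v = H * D.
set.seed(1)
D <- runif(50, min = 10, max = 500)      # hypothetical distances (Mpc)
v <- 70 * D + rnorm(50, sd = 500)        # velocities with measurement noise
fit <- lm(v ~ D - 1)                     # regression through the origin: v = H * D
coef(fit)                                # estimate of H, close to the true value 70
```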

Another common situation is when one has a good model and good parameters, but an incomplete or corrupted set of data. Here, one often has to clean up the data first using error-correcting techniques before proceeding (this often requires adding a mechanism for noise or corruption into the model itself, e.g. adding gaussian white noise to the measurements). This type of problem pervades undergraduate exercises in signal processing, and often arises in computer science and communications science.
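Again purely illustrative (not an example from the post): an R sketch of data corrupted by Gaussian white noise and then "cleaned" with a simple smoother before being used.

```r
# A smooth signal observed with additive Gaussian white noise, then denoised.
t <- seq(0, 1, length.out = 200)
signal <- sin(2 * pi * 3 * t)                      # the "true" underlying curve
observed <- signal + rnorm(length(t), sd = 0.3)    # corrupted measurements
cleaned <- lowess(t, observed, f = 0.1)$y          # local smoothing as a crude clean-up
plot(t, observed, col = "grey"); lines(t, cleaned, lwd = 2)
```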

In all of the above cases, mathematics can be utilised to great effect, though different types of mathematics are used for different situations (e.g. computational mathematics when one has a good model, data set, and parameters; statistics when one has good model and data set but unknown parameters; computer science, filtering, and compressed sensing when one has good model and parameters, but unknown data; and so forth). However, there is one important situation where the current state of mathematical sophistication is only of limited utility, and that is when it is the model which is unreliable. In this case, even having excellent data and knowledge of parameters may lead to error or a false sense of security; this for instance arose during the recent financial crisis, in which models based on independent gaussian fluctuations in various asset prices turned out to be totally incapable of describing tail events.
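A small R illustration of that last point, with simulated fluctuations rather than real asset data: a Gaussian model produces essentially no 5-sigma events, while a heavy-tailed model produces plenty of them.

```r
# Compare tail behaviour of Gaussian vs. heavy-tailed (Student-t) fluctuations.
set.seed(4)
gauss <- rnorm(1e5)
heavy <- rt(1e5, df = 3)                    # Student-t with 3 df: heavy tails
c(gaussian     = mean(abs(gauss) > 5),      # essentially zero
  heavy_tailed = mean(abs(heavy) > 5))      # roughly 1-2% of draws
```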

Nevertheless, there are still some ways in which mathematics can assist in this type of situation. For instance, one can mathematically test the robustness of a model by replacing it with other models and seeing the extent to which the results change. If it turns out that the results are largely unaffected, then this builds confidence that even a somewhat incorrect model may still yield usable and reasonably accurate results. At the other extreme, if the results turn out to be highly sensitive to the model assumptions, then even a model with a lot of theoretical justification would need to be heavily scrutinised by other means (e.g. cross-validation) before one would be confident enough to use it. Another use of mathematics in this context is to test the consistency of a model. For instance, if a model for a physical process leads to a non-physical consequence (e.g. if a partial differential equation used in the model leads to solutions that become infinite in finite time), this is evidence that the model needs to be modified or discarded before it can be used in applications.
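One hedged way to do such a robustness check in R, on toy data of my own making: fit two competing models and compare their predictions; if the predictions barely move, the conclusions are not overly sensitive to the choice of model.

```r
# Robustness check: same data, two different models, compare predictions.
set.seed(2)
x <- runif(100)
y <- 2 * x + rnorm(100, sd = 0.2)
m1 <- lm(y ~ x)                            # model A: straight line
m2 <- lm(y ~ poly(x, 3))                   # model B: cubic polynomial
newx <- data.frame(x = seq(0, 1, by = 0.1))
cbind(modelA = predict(m1, newx),          # very similar columns here,
      modelB = predict(m2, newx))          # so the result is not model-sensitive
```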

It seems to me that one of the reasons why mathematicians working in different disciplines (e.g. mathematical physicists, mathematical biologists, mathematical signal processors, financial mathematicians, cryptologists, etc.) have difficulty communicating with each other mathematically is that their basic environments of model, data, and parameters are so different: a set of mathematical tools, principles, and intuition that works well in, say, a good model, good parameters, bad data environment may be totally inadequate or even misleading when working in, say, a bad model, bad parameters, good data environment. (And there are also other factors beyond these three that significantly influence the mathematical environment and thus inhibit communication; for instance, problems with an active adversary, such as in cryptography or security, tend to be of a completely different nature than problems in which the only adverse effects come from natural randomness, which is for instance the case in safety engineering.)

Monday, June 20, 2011

End of the first cycle

Dear all,

after evaluating the end of the semester, Marcelo and I decided to suspend the LESTE seminars
until the beginning of next semester, when we will return with a new organization. I want to thank everyone
who participated in our activities with such enthusiasm. I hope you had the same joy
I did, and that your scientific curiosity was satisfied a little.

See you soon,
Renato

Thursday, June 16, 2011

An article in Nature on refereeing papers

"They should abandon the attitude that screams: "look, I've read it, I can be as critical as the next dude and ask for something that's not yet in the manuscript""


Tuesday, June 7, 2011

Simulations

An interesting site with different kinds of simulations, such as Fourier functions, vector addition, ...

http://phet.colorado.edu/en/simulations/category/math

A highlight is the binomial distribution simulation, in which you can choose different values for n and p.
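The same experiment can be reproduced in a couple of lines of R; the values of n and p below are just examples, as on the site.

```r
# Simulate the binomial distribution for chosen n and p and inspect the result.
n <- 20; p <- 0.3
x <- rbinom(10000, size = n, prob = p)     # 10,000 simulated counts of successes
barplot(table(x) / length(x), main = "Empirical binomial distribution")
```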

Mapnificent

Terence Tao - Buzz - Public
Another example of layering a useful graphical visualisation tool on top of an existing database: Mapnificent, which allows one to explore the geometry of the transit time metric in various cities, using data from Google Maps.


Sunday, June 5, 2011

Andrew Gelman discusses a recent article by David Spiegelhalter, Christopher
Sherlaw-Johnson, Martin Bardsley, Ian Blunt, Christopher Wood and Olivia Grigg that is scheduled to appear in the Journal of the Royal Statistical Society:

Finally, I am glad that these methods result in ratings rather than rankings. As has been discussed by Louis (1984), Lockwood et al. (2002), and others, two huge problems arise when constructing ranks from noisy data. First, with unbalanced data (for example, different sample sizes in different hospitals) there is no way to simultaneously get reasonable point estimates of parameters and their rankings. Second, ranks are notoriously noisy. Even with moderately large samples, estimated ranks are unstable and can be misleading, violating well-known principles of quality control by encouraging decision makers to chase noise rather than understanding and reducing variation (Deming, 2000). Thus, although I am unhappy with the components of the methods being used here, I like some aspects of the output.


Taken from the blog:

Statistical Modeling, Causal Inference, and Social Science

Saturday, June 4, 2011

Bebel and the LESTE blog

My daughter Bebel got very curious about how posts are put up on the blog.
So I gave a little demonstration.
Voilà.

Thursday, June 2, 2011

More visualization

The website of Stanford's data visualization lab is sensational.

http://vis.stanford.edu/

Why is it that statisticians are not more engaged in this?
As a professional group, are we going to jump on this boat only after
the best nuggets have already been mined? See Machine Learning...

Visualization

LESTE starts a new project this June: data visualization.
It is a joint project with DCC-UFMG, funded by the Ministry of Health.
A very nice place to start is the FLOWING DATA blog.
http://flowingdata.com/



Friday, May 27, 2011

Faculty searches

A hiring exam in the Physics department.
The candidate leaves the room after his presentation.
One member of the selection committee turns to another and says:
- What about this candidate, huh? So much arrogance...

And the other member:
- Yeah, I really liked him too.

Scientist: Four golden lessons - published in Nature


Steven Weinberg

When I received my undergraduate degree — about a hundred years ago —
the physics literature seemed to me a vast, unexplored ocean, every
part of which I had to chart before beginning any research of my own.
How could I do anything without knowing everything that had already
been done? Fortunately, in my first year of graduate school, I had the
good luck to fall into the hands of senior physicists who insisted,
over my anxious objections, that I must start doing research, and pick
up what I needed to know as I went along. It was sink or swim. To my
surprise, I found that this works. I managed to get a quick PhD —
though when I got it I knew almost nothing about physics. But I did
learn one big thing: that no one knows everything, and you don't have
to.

Another lesson to be learned, to continue using my oceanographic
metaphor, is that while you are swimming and not sinking you should
aim for rough water. When I was teaching at the Massachusetts
Institute of Technology in the late 1960s, a student told me that he
wanted to go into general relativity rather than the area I was
working on, elementary particle physics, because the principles of the
former were well known, while the latter seemed like a mess to him. It
struck me that he had just given a perfectly good reason for doing the
opposite. Particle physics was an area where creative work could still
be done. It really was a mess in the 1960s, but since that time the
work of many theoretical and experimental physicists has been able to
sort it out, and put everything (well, almost everything) together in
a beautiful theory known as the standard model. My advice is to go for
the messes — that's where the action is.

My third piece of advice is probably the hardest to take. It is to
forgive yourself for wasting time. Students are only asked to solve
problems that their professors (unless unusually cruel) know to be
solvable. In addition, it doesn't matter if the problems are
scientifically important — they have to be solved to pass the course.
But in the real world, it's very hard to know which problems are
important, and you never know whether at a given moment in history a
problem is solvable. At the beginning of the twentieth century,
several leading physicists, including Lorentz and Abraham, were trying
to work out a theory of the electron. This was partly in order to
understand why all attempts to detect effects of Earth's motion
through the ether had failed. We now know that they were working on
the wrong problem. At that time, no one could have developed a
successful theory of the electron, because quantum mechanics had not
yet been discovered. It took the genius of Albert Einstein in 1905 to
realize that the right problem on which to work was the effect of
motion on measurements of space and time. This led him to the special
theory of relativity. As you will never be sure which are the right
problems to work on, most of the time that you spend in the laboratory
or at your desk will be wasted. If you want to be creative, then you
will have to get used to spending most of your time not being
creative, to being becalmed on the ocean of scientific knowledge.

Finally, learn something about the history of science, or at a minimum
the history of your own branch of science. The least important reason
for this is that the history may actually be of some use to you in
your own scientific work. For instance, now and then scientists are
hampered by believing one of the over-simplified models of science
that have been proposed by philosophers from Francis Bacon to Thomas
Kuhn and Karl Popper. The best antidote to the philosophy of science
is a knowledge of the history of science.

More importantly, the history of science can make your work seem more
worthwhile to you. As a scientist, you're probably not going to get
rich. Your friends and relatives probably won't understand what you're
doing. And if you work in a field like elementary particle physics,
you won't even have the satisfaction of doing something that is
immediately useful. But you can get great satisfaction by recognizing
that your work in science is a part of history.

Look back 100 years, to 1903. How important is it now who was Prime
Minister of Great Britain in 1903, or President of the United States?
What stands out as really important is that at McGill University,
Ernest Rutherford and Frederick Soddy were working out the nature of
radioactivity. This work (of course!) had practical applications, but
much more important were its cultural implications. The understanding
of radioactivity allowed physicists to explain how the Sun and Earth's
cores could still be hot after millions of years. In this way, it
removed the last scientific objection to what many geologists and
paleontologists thought was the great age of the Earth and the Sun.
After this, Christians and Jews either had to give up belief in the
literal truth of the Bible or resign themselves to intellectual
irrelevance. This was just one step in a sequence of steps from
Galileo through Newton and Darwin to the present that, time after
time, has weakened the hold of religious dogmatism. Reading any
newspaper nowadays is enough to show you that this work is not yet
complete. But it is civilizing work, of which scientists are able to
feel proud.

Seminar of May 27

Everyone,

sorry for posting at such short notice.
This Friday (today!!!) we will have our seminar at 10 am in room 2076.
Unfortunately, once again, a faculty assembly of the Statistics department
clashes with our usual time slot and we have to adapt.

Today we have the usual weekly coverage of a journal (CSDA, Computational Statistics
and Data Analysis) with two featured articles, followed by a visitor from afar.
Our former student Thais Paiva, currently a PhD student in statistics at Duke
University in the USA, will give today's show and tell. She will tell us a bit about her experience
in one of the best statistics departments in the world.

The two papers to be presented are the following:

"Dimensionality reduction when data are density functions"
by P. Delicado, presented by Erica Castilho

and "Least squares estimation of nonlinear spatial trends", by
Rosa M. Crujeiras and Ingrid Van Keilegom, presented by Marcia Barbian,
both from 2010.

See you there.
Renato

Friday, May 20, 2011

New seminar cycle

Everyone, here is our new seminar schedule.

27/05
Journal: Computational Statistics and Data Analysis
Non-spatial statistics: Érica
Spatial statistics: Márcia
Show and Share: to be decided

03/06
Journal: Statistics in Medicine
Non-spatial statistics: Fábio Silva
Spatial statistics: Ilka
Show and Share: Ramiro

10/06
Journal: Journal of Computational and Graphical Statistics
Non-spatial statistics: Fábio Demarqui
Spatial statistics: Marcelo
Show and Share: to be decided

17/06
Journal: Technometrics
Non-spatial statistics: Sérgio
Spatial statistics: Aline
Show and Share: to be decided

01/06
Journal: Bayesian Analysis
Non-spatial statistics: Edna
Spatial statistics: Rosângela
Show and Share: to be decided

Thursday, May 19, 2011

Friday's seminar, May 20

Everyone,
tomorrow at 3 pm we will have the LESTE seminar in room 2076, covering the
Journal of Machine Learning Research and the journal Machine Learning. Taking a look at
these journals will be something new for statistics.

If you want to get a head start:
http://jmlr.csail.mit.edu/
e
http://www.springer.com/computer/ai/journal/10994

We will discuss the following articles:
"Clustering Algorithms for Chains" by A. Ukkonen (2011)

It proposes a solution to the following problem: given a set of ordered vectors (chains),
efficiently partition them into clusters, i.e. groups of vectors with similar characteristics under certain criteria.
The article can be found at this link.
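Just to fix ideas, here is a generic R illustration of clustering ranking vectors with plain k-means; this is NOT the algorithm of the paper, which is specifically about doing this kind of grouping efficiently for chains.

```r
# Toy illustration: cluster ranking vectors with ordinary k-means.
set.seed(3)
group1 <- t(replicate(20, sample(5)))            # 20 random rankings of 5 items
group2 <- t(replicate(20, c(1, 2, 3, 4, 5)))     # 20 identical rankings
rankings <- rbind(group1, group2)
km <- kmeans(rankings, centers = 2)
km$cluster                                       # recovered group labels
```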

The other article is

A multivariate Bayesian scan statistic for early event detection and characterization

From the issue entitled "Special Issue on Machine Learning Algorithms for Event Detection".
The article can be found at this link.

And of course we will have our show and share. Prepare whatever you have to show and share.





Saturday, May 7, 2011

R packages

 A program run on 3/24/2011 counted 4,338 R packages at all major repositories, 2,849 of which were at CRAN.
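For the curious, here is one quick way to get the current CRAN count from inside R; the repository URL below is just one mirror (an assumption on my part), and the number will of course be far larger today than in 2011.

```r
# Count the packages currently available on the chosen CRAN mirror.
pkgs <- available.packages(contriburl = contrib.url("https://cran.r-project.org"))
nrow(pkgs)
```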


It's all taken over

A measure of software popularity is the number of other web pages that contain links that point to the software’s main web site. The figure provides those numbers, recorded using Google on March 19, 2011.



Videos on Data Analysis with R

The post below comes from Jeromy Anglim's blog.

Videos on Data Analysis with R: Introductory, Intermediate, and Advanced Resources

If you want to learn about R through videos, there are now a large number of options. This post provides links to many of these videos under the headings of: (a) What is R? (b) Introductory R, and (c) Intermediate and Advanced R.

What is R?

If you are evaluating whether you should learn and use R, these videos explain what R does and why it is worth learning.
  • David Smith from REvolutions provides an Introduction to R.
  • Dice News presents a two minute quick overview of R
  • Courtney Brown presents a 7 minute video on why he encourages his students to learn R.
  • J.D. Long presents an hour long video conference on how he came to use R.

Introductory R

If you have just started learning R, these videos show you how to do basic data analytic tasks in R.
  • Rvideos provides around 20 videos giving an introduction to R on topics such as bootstrapping, ANOVA, Graphs, Dates, Sampling, Help, and more.
  • ramstatdavid has around 25 videos on YouTube each of around 5 minutes. Many of the videos are on producing plots in R.
  • LearnViaWeb has around 15 introductory R videos. Topics include loops, time series, installing R, reading and writing data, writing a function, GLM, and using random numbers.
  • Decision Science News has two ten minute video tutorials providing an introduction to R.
  • wildscop on YouTube has eight introductory tutorials on R. Topics include installing R, using help, opening files, interacting with the console, manipulating matrices, and plotting.
  • ahmetrics on YouTube has five tutorials on R on topics such as trellis plots, reading Excel data, and simple linear regression.
  • Learning R Toolkit has a six module course on R with videos. The first two modules are free.
  • regionomics on YouTube has around 5 videos on R interspersed with videos on other topics. Topics include an introduction to R, Rcmdr, and Discrete Choice models.
  • Dr. Thomas MacFarland has a series of introductory lectures on R using the Tegrity viewer. It assumes very little prior knowledge of computing or statistics. When I tried it, Tegrity worked better in Firefox than in Chrome.
  • off2themovies2 on YouTube has around 20 introductory tutorials on R
  • mattstat on YouTube has 7 introductory tutorials on R
  • MrIanfellows on YouTube has 3 videos providing examples of using the Deducer GUI to interact with R.
I thank Rob Hyndman for helping me discover a few of the above.

Intermediate and Advanced Materials on R

If you know the basics of R and are interested in learning about a particular topic related to R, the following videos are great. I find it useful to hear how experts talk about R in order to get a sense of how to speak and think about R.
  • Drew Conway makes available a set of videos through VCASMO. The talks are drawn from a range of presenters. The list is growing and at the time of posting includes talks on: ggplot2, Social Network Analysis in R, SQL and R, Python and R, optimisation and R, Zelig, and more.
  • Hadley Wickham has a four-part video short course on graphics with R. It is particularly good if you want to learn ggplot2.
  • David Mease provides a complete video course on Data Mining using R and Excel
  • McStatsTutorials on YouTube has 6 videos on Time Series. A couple of videos demonstrate how to perform analyses using R.
  • For my own videos, see the Video Tag of my blog. For example, a 34 minute video analysing Winter Olympic Medal data using StatET and R.

Related Resources

If you enjoy learning about data analysis through videos, you may also want to have a look at my earlier post on full-length mathematics and statistics video courses. These are great if you need to consolidate more fundamental mathematical and statistical concepts. I also have a post with links on Getting Started with R.
If there are links to videos on R that I have left off the list, feel free to post a link in the comments.

Friday, May 6, 2011

The other one...

The other presentation today, Friday, May 6, will be by Rosângela Loschi,



and she will present the 2010 JRSS-B paper


Hybrid Dirichlet mixture models for functional data
Sonia Petrone (Bocconi University, Milan, Italy), Michele Guindani (University of New Mexico, Albuquerque, USA) and Alan E. Gelfand (Duke University, Durham, USA)

Summary. In functional data analysis, curves or surfaces are observed, up to measurement error, at a finite set of locations, for, say, a sample of individuals. Often, the curves are homogeneous, except perhaps for individual-specific regions that provide heterogeneous behaviour (e.g. 'damaged' areas of irregular shape on an otherwise smooth surface). Motivated by applications with functional data of this nature, we propose a Bayesian mixture model, with the aim of dimension reduction, by representing the sample of curves through a smaller set of canonical curves. We propose a novel prior on the space of probability measures for a random curve which extends the popular Dirichlet priors by allowing local clustering: non-homogeneous portions of a curve can be allocated to different clusters and the individual curves can be represented as recombinations (hybrids) of a few canonical curves. More precisely, the prior proposed envisions a conceptual hidden factor with k levels that acts locally on each curve. We discuss several models incorporating this prior and illustrate its performance with simulated and real data sets. We examine theoretical properties of the proposed finite hybrid Dirichlet mixtures, specifically, their behaviour as the number of the mixture components goes to infinity and their connection with Dirichlet process mixtures.

Keywords: Bayesian non-parametrics; Dependent random partitions; Dirichlet process; Finite mixture models; Gaussian process; Labelling measures; Species sampling priors

GEOMED 2011


One of those conferences that is worth it:
excellent speakers in a place that is a paradise on Earth.
Submissions for contributed papers are open.
-------------------------------------------------------------------------------------------------
GEOMED 2011 is the 7th international, interdisciplinary conference on spatial statistics and geomedical systems.  GEOMED brings together statisticians, geographers, epidemiologists, computer scientists, and public health professionals to discuss methods of spatial analysis related to issues arising in public health.

On behalf of the organizing committee, we welcome GEOMED to Canada. The meeting will be held in Victoria, British Columbia, a beautiful Canadian destination.

***Abstract submission for contributed posters is now open***
===================

Conference Dates

===================

October 20th - October 22nd, 2011 ~ Inn at Laurel Point, Victoria, British Columbia, Canada

Detailed information is available at:

http://geomed2011victoria.com/

============================

Contributed Poster Session

============================

The contributed poster session will be held on Thursday, October 20th, 2011.

GEOMED 2011 is now accepting abstract submissions on a host of topics relating to spatial statistics, geography and public health, including:

* Pattern Detection and Public Health Surveillance

* GIS and Public Health

* Disease Mapping

* Spatial Statistics for Neuroimaging

* Spatial Statistics for Environmental Epidemiology

* Spatial Point Processes and Applications

* Health Geography and Spatial Data

Abstracts may be submitted at:

http://geomed2011victoria.com/abstract-submission/

A current list of confirmed speakers is available at:

http://geomed2011victoria.com/invited-speakers/

=======================

Conference Proceedings

=======================

Two special issues of Statistical Methods in Medical Research (SMMR) have been reserved for refereed papers from this meeting. The first issue will focus on general disease mapping, and the second issue will focus on spatial statistics for neuroimaging.  In addition, an issue of the new Elsevier journal Spatial and Spatio-Temporal Epidemiology (SSTE) will be set aside for refereed papers with a more epidemiological focus.

Sincerely,

Farouk Nathoo and Charmaine Dean, Program Co-Chairs

Wednesday, May 4, 2011

Our next seminar will present this article from JRSS Series B.
The presenter will be Prof. Edna Reis, cat lover.
No, the cats will not be present at the seminar.


We are back to our usual bat-time of 3 pm, in the same bat-cave.
See you,
Renato
PS: I will not have time to post the paper for download; whoever wants it, write to
edna@est.ufmg.br or to me.

Education's not the answer, after all

By PAUL KRUGMAN
THE NEW YORK TIMES
March 7, 2011, 7:34PM




It is a truth universally acknowledged that education is the key to
economic success. Everyone knows that the jobs of the future will
require ever higher levels of skill. That's why, in an appearance last
week with former Florida Gov. Jeb Bush, President Barack Obama
declared that "If we want more good news on the jobs front then we've
got to make more investments in education."

But what everyone knows is wrong.

The day after the Obama-Bush event, the New York Times published an
article about the growing use of software to perform legal research.
Computers, it turns out, can quickly analyze millions of documents,
cheaply performing a task that used to require armies of lawyers and
paralegals. In this case, then, technological progress is actually
reducing the demand for highly educated workers.

And legal research isn't an isolated example. As the article points
out, software has also been replacing engineers in such tasks as chip
design. More broadly, the idea that modern technology eliminates only
menial jobs, that well-educated workers are clear winners, may
dominate popular discussion, but it's actually decades out of date.
The fact is that since 1990 or so the U.S. job market has been
characterized not by a general rise in the demand for skill, but by
"hollowing out": Both high-wage and low-wage employment have grown
rapidly, but medium-wage jobs — the kinds of jobs we count on to
support a strong middle class — have lagged behind. And the hole in
the middle has been getting wider: Many of the high-wage occupations
that grew rapidly in the 1990s have seen much slower growth recently,
even as growth in low-wage employment has accelerated.

Why is this happening? The belief that education is becoming ever more
important rests on the plausible-sounding notion that advances in
technology increase job opportunities for those who work with
information — loosely speaking, that computers help those who work
with their minds, while hurting those who work with their hands.
Some years ago, however, the economists David Autor, Frank Levy and
Richard Murnane argued that this was the wrong way to think about it.
Computers, they pointed out, excel at routine tasks, "cognitive and
manual tasks that can be accomplished by following explicit rules."
Therefore, any routine task — a category that includes many
white-collar, non-manual jobs — is in the firing line. Conversely,
jobs that can't be carried out by following explicit rules — a
category that includes many kinds of manual labor, from truck drivers
to janitors — will tend to grow even in the face of technological
progress.

And here's the thing: Most of the manual labor still being done in our
economy seems to be of the kind that's hard to automate. Notably, with
production workers in manufacturing down to about 6 percent of U.S.
employment, there aren't many assembly-line jobs left to lose.
Meanwhile, quite a lot of white-collar work currently carried out by
well-educated, relatively well-paid workers may soon be computerized.
Roombas are cute, but robot janitors are a long way off; computerized
legal research and computer-aided medical diagnosis are already here.

And then there's globalization. Once, only manufacturing workers
needed to worry about competition from overseas, but the combination
of computers and telecommunications has made it possible to provide
many services at long range. And research by my Princeton colleagues
Alan Blinder and Alan Krueger suggests that high-wage jobs performed
by highly educated workers are, if anything, more "offshorable" than
jobs done by low-paid, less-educated workers. If they're right,
growing international trade in services will further hollow out the
U.S. job market.

So what does all this say about policy?

Yes, we need to fix American education. In particular, the
inequalities Americans face at the starting line — bright children
from poor families are less likely to finish college than much less
able children of the affluent — aren't just an outrage; they represent
a huge waste of the nation's human potential.

But there are things education can't do. In particular, the notion
that putting more kids through college can restore the middle-class
society we used to have is wishful thinking. It's no longer true that
having a college degree guarantees that you'll get a good job, and
it's becoming less true with each passing decade.

So if we want a society of broadly shared prosperity, education isn't
the answer — we'll have to go about building that society directly. We
need to restore the bargaining power that labor has lost over the last
30 years, so that ordinary workers as well as superstars have the
power to bargain for good wages. We need to guarantee the essentials,
above all health care, to every citizen.

What we can't do is get where we need to go just by giving workers
college degrees, which may be no more than tickets to jobs that don't
exist or don't pay middle-class wages.

Krugman is a columnist for The New York Times.