Saturday, October 27, 2018

#el30 Data and Models

I should be grading student documents this morning, but I'm thinking about #el30. I may have an assessment of that next week.

Anyway, as I was reading some posts about Data, I was struggling with our previous discussion about the differences between human and machine learning, when something that AK wrote sparked some coherent ideas (at least dimly coherent for my part). AK said: "This got me thinking about the onus (read: hassle) of tracking down your learning experiences as a learner. ... As a learner I don't really care about tracking my own learning experiences."

I thought: no, I don't want to track all my learning experiences either. Tracking them all would take all my time, leaving none for more learning, much less for grading my students' papers. So maybe computers can be useful for tracking my learning experiences for me? A computer can attend me--say, strapped to my wrist, in my pocket, or embedded in my brain--and collect data about whatever my learning experiences are. After all, computers can collect, aggregate, and process data much faster than I can, and as Jenny notes, computers don't get tired.

But what data does a computer identify and collect? Even the fastest computer cannot collect all the bits of data involved in even the simplest learning task. How will the computer know when I'm learning this and not that? Well, the computer will collect the data that some human told it to collect. Can the computer choose to collect different data if the situation changes, as it certainly will? Perhaps. But again, it can only ever collect a subset of data. How will it know which is the relevant, useful subset? The computer's subset of data may be quantitatively larger than my subset, but will it be qualitatively better? How might I answer that question?
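The point that a computer collects only what some human told it to collect can be made concrete. Here is a toy sketch in Python (the schema, field names, and example experience are all hypothetical, invented for illustration): a tracker with a fixed schema silently drops every part of the experience its designer did not anticipate.

```python
# A hypothetical experience tracker: it keeps only the fields its human
# designer specified in advance and silently discards everything else.
SCHEMA = ("activity", "duration_minutes", "score")

def collect(event: dict) -> dict:
    """Return only the parts of an experience the schema anticipated."""
    return {k: v for k, v in event.items() if k in SCHEMA}

# A rich experience, flattened to whatever the schema can hold.
experience = {
    "activity": "reading #el30 posts",
    "duration_minutes": 45,
    "mood": "distracted",          # dropped: not in the schema
    "conversation_with": "AK",     # dropped: not in the schema
}

print(collect(experience))
```

The quantitative question (how many fields?) is easy to improve; the qualitative question (are they the right fields?) is decided by the human who wrote the schema, before any data arrives.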

Turning experience into data is a big issue, and I want to know how the xAPI manages it. Making data of experience requires a model of experience, and a model always leaves out most of the experience. The hope, of course, is that the model captures enough of the experience to be useful, but then that utility is always tempered by the larger situation within which the learning and tracking take place. Can a computer generate a better model than I can? Not yet, I don't think.
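For a sense of how the xAPI models experience: it reduces an experience to a statement built around an actor, a verb, and an object (with optional context and result). A minimal sketch in Python follows; the name, email, and activity URL are hypothetical placeholders, though the actor-verb-object shape and the `adlnet.gov` verb identifier follow the published xAPI convention.

```python
import json

# A minimal xAPI-style statement: an experience reduced to
# actor, verb, and object. Names and URLs are placeholders.
statement = {
    "actor": {
        "name": "Keith",
        "mbox": "mailto:keith@example.com",
    },
    "verb": {
        "id": "http://adlnet.gov/expapi/verbs/experienced",
        "display": {"en-US": "experienced"},
    },
    "object": {
        "id": "http://example.com/activities/el30-week-2-data",
        "definition": {"name": {"en-US": "#el30 week 2: Data"}},
    },
}

print(json.dumps(statement, indent=2))
```

Everything that does not fit the actor-verb-object triple is simply not recorded, which is exactly the sense in which a model leaves out most of the experience.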

If both the computer and I are peering into an infinity of experience, and I can capture only about six feet in data while the computer can capture sixty feet, or even six hundred, we are both still damned near blind, quantitatively speaking. Reality goes a long way out, and constructing models to capture that reality is still something humans have to do.

I've no doubt that computers will help us see farther and wider than we do now, just as telescopes and microscopes helped us. I've also no doubt that computers will help us analyze and find patterns in that additional data, but I'm not yet convinced that computers will create better models of reality without us. When I see two computers arriving at different views of Donald Trump and arguing about their respective views, then I might change my mind.

The #MeToo Text: From Documents to Distributed Data #el30

This week's E-Learning 3.0 task is about distributed data, and it gives me a way to think about the #MeToo document that has occupied me for the past year and that has been the topic of several posts in this blog. In short, I take the #MeToo text (all several million tweets of it and more) to represent a new kind of distributed document that is emerging on the Net. Thus, it may be a manifestation of the kind of shift in how we handle data that Downes discusses.

Downes introduces his topic this way:
This week the course addresses two conceptual challenges: first, the shift in our understanding of content from documents to data; and second, the shift in our understanding of data from centralized to decentralized. 
The first shift allows us to think of content - and hence, our knowledge - as dynamic, as being updated and adapted in the light of changes and events. The second allows us to think of data - and hence, of our record of that knowledge - as distributed, as being copied and shared and circulated as and when needed around the world.
I teach writing--both the writing of one's own and the writings of others--which since the advent of Western rhetoric in Greece some two and a half millennia ago has focused on centralized documents. By that I mean that the function of a document (this blog post, for instance, or a poem or report) was to gather data, organize that data into a format appropriate for a given rhetorical situation, and then present that data in a single spoken or written text. This is generally what I teach my students to do in first-year college composition. This is what I'm trying to do now in this blog post. This is, at least in part, what Downes has done in his E-Learning 3.0 web site. Most Western communication has been built on the ground of individual documents or a corpus of documents (think The Bible, for instance, or the Mishnah or the poems of John Berryman).

This idea of a centralized document carries several assumptions that are being challenged by the emergence of distributed data, I think. First, the Western document assumes a unified author--either a single person or a coherent group of people. Western rhetoric has a strong tendency to enforce unity even where it does not exist (think of the effort to subsume the different writers of The Bible, for instance, under the single author God). The Western notion of author-ity still follows from this notion of a single, unified author, and the value and success of the document depends in great part upon the perceived authority of this author.

Along with a single, unified author, the Western document assumes a unity within itself. A document is supposed to be self-contained, self-sufficient. It is supposed to include within it all the data that is necessary for a reader to understand its theme or thesis. I don't believe that any document has ever been self-sufficient, but this is the ideal. A text should be coherent with a controlling theme (poetic) or thesis (rhetoric). The integrity and value of the text are measured by how well the content relates to and supports the theme or thesis.

And of course, a document should have a unity of content. It should have a single narrative, a single experience, a single argument. Fractured, fragmented narratives bother us, and they never make the best-seller lists. Incoherent arguments seldom get an A or get published.

There may be other unities that I could mention, but this is sufficient to make my point that we have a long history of aggregating, storing, and moving data in documents with their implied unities. And then along comes #MeToo: a million tweets and counting over days, weeks, and months. We have this sense that surely #MeToo is hanging together somehow, but is it really a single text?

Well, not in the traditional sense. It has no unified author. Just when we thought that Alyssa Milano started it, we learn that some other woman, Tarana Burke, used the phrase ten years ago. #MeToo isn't even a unified group. A million women are not a unified group. It has no unified thesis. It isn't even an argument. There is no dialectic or rationale. It has no unified content. We think it does because of the single hashtag, but each woman brings a unique set of experiences to her tweet: some have a leer or catcall, some gropings, others rapes or years of beatings. All of them have something different, something unique. They cover the gamut, the field, the space.

#MeToo is a swarm, and we really don't like swarms. Who's speaking here, to whom, and about what? What's the point? And what kind of document is this? How do I read it? How do I respond?

#MeToo is a rhizome, a fractal, and I'm thinking we will come to write and to read this way. We will think this way. Perhaps we always have, and our documents obscured that for us. #MeToo makes explicit a million neurons firing.

And finally, I must recognize that #MeToo could not have been written or read without our technology. This way of knowing, thinking, and expressing is possible only with help--in this case, Twitter to write it and, to some degree, to read it--though reading millions of tweets is impossible for a single human. We need the data analysis powers of our computers even to approach a comprehensive reading of #MeToo. We need something like Valentina D'Efilippo's reading strategies and tools in her article "The anatomy of a hashtag — a visual analysis of the MeToo Movement".
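To suggest what even the crudest computational reading might look like, here is a toy sketch using only Python's standard library. The three tweets are invented placeholders, not real #MeToo data; the point is only that a machine reads a swarm by counting, not by comprehending.

```python
import re
from collections import Counter

# Toy corpus: invented placeholder tweets, not real #MeToo data.
tweets = [
    "#MeToo at my first job, and no one believed me",
    "#MeToo on the subway, every week for a year",
    "#MeToo my manager, my professor, my coach",
]

# Tokenize each tweet, drop the hashtag itself, and count recurring words.
words = Counter(
    w
    for t in tweets
    for w in re.findall(r"[a-z']+", t.lower())
    if w != "metoo"
)

print(words.most_common(3))
```

Scaled to millions of tweets, this kind of aggregation is roughly what visual analyses of the hashtag do: it finds patterns across the swarm that no single human reader could hold in mind, while leaving the meaning of any one tweet untouched.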

I'm wondering, then, what happens when not only data is distributed and decentralized, but when documents themselves become distributed and decentralized. Is this fake news?

Monday, October 22, 2018

Being Human among Computers: #el30

With a number of other online colleagues, I'm starting a new MOOC with Stephen Downes entitled "E-Learning 3.0". According to Stephen's introduction:
This course introduces the third generation of the web, sometimes called web3, and the impact on e-learning that follows. In this third generation we see greater use of cloud and distributed web technologies as well as open linked data and personal cryptography.
The first week featured a Google Hangout between Stephen in Canada and George Siemens in Australia. I've posted the video here, starting it about seven-and-a-half minutes in to avoid the setup issues.

As Jenny Mackness notes in her blog post about the conversation, Siemens and Downes wax philosophical in their conversation, centering "around what it means to be human and what is human intelligence in a world where machines can learn just as we do."

While I understand the fascination of such a question as computer technologies increasingly approximate many of our intellectual capabilities, in some ways the question seems moot. For me, part of what it means to be human is to use tools and technologies that enhance our innate human capabilities. Admittedly, most of our early tools enhanced our physical capabilities, making us stronger, faster, and warmer, but from the beginning we also created technologies that enhanced our intellectual capabilities. I think of language as a technology, and I am not yet convinced that computers will change us more than language, in both spoken and written forms, already has. I can almost see computers as a refinement and extension of language, which started with speech, developed into writing (making marks also led to math and drawing), and now finds expression through computing. Few things distinguish us from other life forms as much as our tools and technologies do.

Did Shakespeare write Hamlet or did the English language? Well, both actually.

Part of the fascination of this question about human vs. computer intelligence comes from our apprehension that computers will become more powerful than we are. This is an old fear, as the American folk tale of John Henry demonstrates, but for me, the lesson of John Henry is that we will continue to use computers to make us smarter despite our fears. I suppose the fearful prospect is that computers will use us to make themselves smarter, or that they will simply come to ignore us, having become so smart themselves that our abilities add nothing to them. I don't think they will destroy us; rather, they'll abandon us. This is a problem mostly if you think that humans are the smartest thing in the universe and that computers will usurp our position. It seems rather chauvinistic to think that humans are the crowning achievement in this wondrously large and varied universe. The odds are surely against it, I think.

Almost all complex systems that I know about can learn: they take in information from the ecosystem, process that information, make structural adjustments to fit their environments better, and then feed information back into the ecosystem, which likewise is trying to make a better fit for itself. I have no doubt that computers will do the same, and if our ecosystem comes to include smart machines, then we and the rest of the ecosystem will have to adapt to those new entities. The universe will manage that adaptation quite nicely and count itself more advanced for it.

But that's the long game. In the short game, I am keen to explore how smart machines can help me and my students learn differently, maybe better.