Archives for the month of: March, 2013

Before taking this class I’d never even heard of, let alone worked with, the Python programming language or Komodo Edit. In fact, other than a bit of minor programming in high school, I have mostly steered clear of it, both for lack of understanding and because I failed to see how knowing how to write code could possibly benefit me as a History major. However, having worked through the lessons on the Programming Historian website, I found that there are actually ways to manipulate texts using Python to assist in research.

As computers are able to process information many times faster than a human can, using them to assist in research can be a very good time-saving tool if you know what you are doing. Working through the lessons in the Programming Historian modules allowed me to create a program that cuts out stop-words, punctuation and similar unnecessary information in order to create a list of word frequencies, which gave me a general idea of what an article was about. Is this really useful, though, when such tools already exist without having to do the programming yourself? Perhaps not, but it was still certainly interesting to see how these tools work and to learn that I could also create such tools with a little work.
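A minimal sketch of the kind of word-frequency program described above. It is not the Programming Historian's own code: the function name `word_frequencies`, the tiny `STOPWORDS` set and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "was", "is", "it"}

def word_frequencies(text, stopwords=STOPWORDS):
    """Return (word, count) pairs, most frequent first, ignoring stop-words."""
    # Lowercase the text and keep only runs of letters, which drops punctuation.
    # (This simple pattern also splits words at apostrophes.)
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in stopwords)
    return counts.most_common()

# Made-up sample text, standing in for an article.
sample = ("The trial was held in London. The jury heard the trial of "
          "the prisoner, and the prisoner was acquitted.")

for word, count in word_frequencies(sample)[:3]:
    print(word, count)
```

Even on a short sample, the words left at the top of the list ("trial", "prisoner") hint at what the text is about, which is the idea behind using word frequencies as a first look at an article.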

The biggest problem I found working with the Programming Historian was the layout of the lessons. Sometimes the lessons do not specify where to enter parts of the code, which I found confusing, as my code would not work despite my following them exactly. For the most part I was able to find my mistakes later on in the lesson as it became clear what I’d done wrong, but it would have been nice if the lessons were more specific as you work through them rather than waiting until the end to clarify what you’re doing. Ultimately I was only able to complete lesson five before getting completely stuck on lesson six, to the point where I could not figure out where I’d gone wrong. I even went back over the line of code that created the error message and compared it to the code in the lesson, to no avail. Perhaps if I were better versed in the Python programming language I could’ve caught my own mistake.

Ultimately I found Python very interesting. It was cool to see what you could create yourself using the programming language, and it is definitely a very useful tool. In the end, however, I found the Programming Historian confusing and aggravating to use.

 

As historians, oftentimes when we think of computer programming we don’t associate it with history. However, we have seen that this is not necessarily true, due to the advancement of technology over the years. In ancient societies this notion of computer programming was displayed in the most basic sense to keep a record of their history. As time progressed and human beings became eager for knowledge and power, we began to shift our methods of record keeping from man-made to artificial intelligence. This resulted in the birth of computer programming. Through the use of various programming languages such as Java, Python and Smalltalk, we were able to see how computer programming has been used to further historical research. More specifically, in terms of Python, the language itself could be used by the average person to organize digital information. This was illustrated through the “Hello World” program. On the other hand, the one drawback of Python would be the amount of accuracy needed to achieve the desired result. For example, the placement of certain characters is key to the end product.

I personally had a lot of difficulty and felt limited when setting up this program. As someone who felt they had a solid computer background, I was constrained by the basic mechanics of the program, which in turn led to a lot of frustration. However, with the use of Digital Historian I was guided through the installation process while learning how to use the program itself. From my perspective this program is a great tool for beginning web designers, people within this field, or historical researchers. I personally cannot see myself using this program beyond this class. Overall, Python was an excellent learning experience and opened my eyes to another aspect of digital history.

 

Let me start this off by saying this is not the first experience I’ve had with coding or with using terminal-based applications. Despite being a History major, I’ve always had quite a computer bent. I didn’t go into Computer Science because I can’t grasp advanced math, but working with computers is something I’m very comfortable with. So, I probably had a very different experience than some of the others in this course.

With that aside, to the question of this post: is coding a valuable thing for Historians to know how to do and is The Programming Historian a good way of teaching it? The answer to that question would be “yes” and “for the most part”.

Computers can process information way faster than a person can, and coding can be a very quick and efficient way to get the information you need from a large pool of sources. However, it can be very intimidating to someone new to it, especially when things don’t go as planned and your computer starts spitting out error messages that don’t always make much sense to people unfamiliar with computer terminology. The Programming Historian is pretty good at introducing historians to coding and what it can do for them, but there are some problems with it.

One of the biggest problems I have with The Programming Historian is that it doesn’t really ever tell you much about troubleshooting errors, something that any coder will absolutely encounter. Nobody, not even the world’s greatest coder, will always write perfect code on the first try. It could be something as simple as a typo, but you will still get an error message. Understanding the basics of a language like Python is great, but if you have no idea what to do when an error pops up, then you are not going to get very far as a coder. The Programming Historian has virtually no reference to errors and troubleshooting. While I have had enough experience with computers to be able to decipher the root of many error messages, other people have not. So, they are potentially left completely stuck when even one error message shows up. Sure, you could just download the proper code at the end of the lesson, but that doesn’t really solve the problem. Knowing what a piece of code does is good. Knowing how it does it is much, much better.
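To make the typo point concrete, here is a small sketch, not taken from The Programming Historian, of how a one-letter mistake turns into an error message and how that message can be read. The function `count_words` and its deliberately misspelled variable are invented for illustration.

```python
def count_words(text):
    """Count the words in a string (contains a deliberate typo)."""
    words = text.split()
    return len(wrods)  # typo on purpose: this should be "words"

try:
    count_words("amongst them a black there was one")
except NameError as error:
    # The exception type says what kind of problem occurred;
    # the message names the exact thing Python could not find.
    message = "{}: {}".format(type(error).__name__, error)
    print(message)
```

The printed message starts with the error type (`NameError`) and then names the misspelled variable (`wrods`), which is usually enough to go back and spot the typo. Without the `try`/`except`, Python would also print a traceback giving the line number where the error happened.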

The Programming Historian also has a tendency not to explain how pieces of code work, particularly in the later, more complex lessons. It explains what a certain block of code is going to do, but not exactly how. You are often left writing functions whose inner workings you don’t understand, knowing only what the finished product will end up doing. Again, as I said, knowing what a piece of code does is good, but knowing how it does it is better.

Programming can be very useful for a historian trying to work with digital sources and is a good thing to learn. The Programming Historian is also a good source for teaching those without a background in the field how coding can be useful for them, but it does leave some things, like knowledge on troubleshooting errors, to be desired.

The way we document our history has changed over the years, from cavemen’s symbols on walls to present-day computer databases and archives. We have learnt the significance that textual analysis, more specifically “the study of recorded human communications,” has had on history. Through the use of books, websites, paintings, etc., we have been able to document our findings on earth as well as understand human development and communication skills. This was displayed via the use of various text analysis tools such as Voyant, Wordle and N-Gram Viewer. These tools enable us to better understand how important text is to the development of human history, technology and digital history archives, as we saw from our first couple of posts. By using the above tools we are able to find trends in various writings and understand commonalities among them. For example, I will be using both Voyant and Wordle as my primary tools in my final project, seeing that I will be creating a blog/website analysing lyrical trends in Hip Hop music from the 1980s up until the early 2000s.

As mentioned above, using these tools can be very effective when it comes to data collection as well as finding similarities in specific works. Text analysis tools like Wordle make it easier to get important pieces of information from larger documents that would otherwise be more difficult to read or understand, for example Shakespearean literature. However, there are some inconsistencies in using programs like Voyant and Wordle, in that they do not always give an accurate portrayal of what is being said in the text. Although these tools are user-friendly, they are limited in terms of word representation.

 

 

Textual analysis is the study of what, when, and how often certain words appear in certain contexts and, through that, drawing some conclusions about the politics, culture, language or social norms of the period, or many other topics. Before, the scope of textual analysis was limited due to how long it would take for a person to conduct it, but now, using computers, textual analysis can reach a scope and scale beyond anything ever fathomed before. Tools such as the Google Ngram Viewer and Mining the Dispatch take advantage of such abilities and allow us to explore history in a unique way, but are they really all that useful? Let’s explore them to find out.

Mining the Dispatch takes digitized versions of the Richmond Daily Dispatch, a daily paper published in Richmond during the American Civil War, and uses Topic Modeling to try to separate the various articles into topics by detecting what words are used in each. By dividing the articles into topics, you can then graph how common each topic was over the entire run of the paper. The problem with how Mining the Dispatch is set up is that, as stated in the intro, the software dictated what topics were used, instead of the historian, resulting in some fairly broad topics. It was also fairly difficult to find the purpose of the project and what they were setting out to do. Graphs weren’t labeled very well either.

The other issue I had with it is that the same few entries were often duplicated over and over within a topic, so it took a while to get past them. For example, the “Fugitive Slave Ads” page was entirely dominated by ads offering a $10 reward for returning a slave named Parthena and a $100 reward for returning a slave named Sam. The topic assignments were also fairly odd and inconsistent. For some reason, each ad looking for Sam was given different topic assignments, despite their content being identical.

[Image: Mining the Dispatch screenshot]

Why are the topics so different than the first one?

The Google Ngram Viewer shows how often a certain phrase appears in Google’s corpus of digitized books within a specified time period. It is pretty well laid out and easy to use, which was nice. To play around with it a little, I decided to input “Germany, France, Britain, United States”. This is the result I got.

[Image: Google Ngram Viewer graph for “Germany, France, Britain, United States”]

Now while this does give some interesting data, it is limited in what it can provide. The graph shows the rise in the influence of the United States over time and the falling influence of France and Britain. Both the US and Germany peak twice during the two World Wars, which would make sense. However, the problem with this data is similar to the problem with nearly all textual analysis tools. They don’t give any context. The data provided by the Ngram Viewer does not give any context as to how, for example, “United States” or “Germany” are used in books. Are they pro-American, anti-American, pro-German? We have no context, and without any context, we are lacking a lot in our research.

The third thing I explored was the Science Magazine article “Quantitative Analysis of Culture Using Millions of Digitized Books”. This article brought up some great points on how textual analysis can be used to examine the evolution of language and grammar use, which would, indeed, be a great use for the technology. But I did find a fair bit of their research to be either long-winded or simply difficult to follow.

Textual analysis can be a useful tool for historians and will probably become more so as the amount of digitized material grows. However, since it doesn’t give much context to work with, it is limited and would need to be supplemented by other forms of research.

From objects and word of mouth, to artifacts and primary sources, to web pages and online journal articles: should historians now really need to know how to program?

The manner in which the gathering and distributing of information has developed and evolved is remarkable; however, I do not think that historians have to know how to program in order to contribute effectively to their field. Firstly, with all the programming we have done in class, I found that I still don’t really understand how to program or see the point of it. With all the time and work I put into putting the words “Hello World” on a page on my computer, I could have been online looking up information for my final projects, or fooling around with Neatline or one of the other programs that we have learned about to get a better understanding of them. Secondly, I found that being thrown into using Komodo Edit as fast as we were made it very difficult to understand what we were doing. After it was explained a couple of times I felt that I could somewhat understand what we were doing with the writing of the code. Having limited background in computer science, I was not comfortable with programming in general and felt I would never understand it well enough for it to hold any value during my research. I really felt that it would have better served those of us in the class who have never written or seen code before to have had more class time spent explaining what everything stands for. The code below is from a class lesson that we had to write (you can find it here: http://programminghistorian.org/lessons/keywords-in-context-kwic)

# calculate the length of the n-gram
kwic = 'amongst them a black there was one'.split()
n = len(kwic)
print(n)
# -> 7
# calculate the index position of the keyword (the middle word)
keyindex = n // 2
print(keyindex)
# -> 3
# display the items before the keyword
print(kwic[:keyindex])
# -> ['amongst', 'them', 'a']
# display the keyword only
print(kwic[keyindex])
# -> black
# display the items after the keyword
print(kwic[(keyindex+1):])
# -> ['there', 'was', 'one']
Now I did do all the lessons before it, such as when we had to write “Hello World” and have it come up three times on a page, or when we wrote code to get a word count from the Old Bailey web page. I even did the code above in class, but I got lots of help with every lesson including this one and still find I do not understand this code. I know we could have a course that takes an entire term to learn how to program, and that programming is a valuable skill to have even for historians, but I believe that three or four classes is far too little for people who have never worked with it before to learn and pick it up. I would much rather continue working with and learning about the other, I believe, far more useful programs (that someone else created) like Neatline or SketchUp. I could see more value in an interdisciplinary course where people who choose to specialize in historical code learn to develop programs for historians to utilize. Code is not for everyone, but having historians who can use it and make websites/programs for other historians to use would make doing research a lot easier and help out the community of historians greatly.

By nature, humans are social beings, and an integral part of socializing is telling stories. For centuries, we used speech and oral history as the primary way of telling stories. This changed with the invention of the printing press and the standardization of languages. The printing press also made communication and education more accessible. This new technology caused us to change the methods by which we told stories so that we could make use of it. Such is the case now with the advent of the World Wide Web. Websites, online databases, Facebook pages, and even blog posts are just adapted methods of storytelling.

Personally, I like the idea of citizen history and the idea of everyone being able to contribute to the preservation of history via these online databases. History is more than just what is written in textbooks by scholars, academics, and teaching ‘authorities’. As the name implies, history involves an exchange of stories. Archives such as the 9/11 archive allow every Tom, Dick or Harry to tell their stories. I also believe that these databases are valuable not only for the contributors, but also for the readers. They provide a fairly accurate description of the events that is far more relatable to the common person than, say, a textbook.

I also found it interesting that the databases covered events that occurred within the past decade. When most people think of history, usually there is a time span of about 20 years before an event can be considered historic. However, I believe that this method of recording recent events is important, as it can capture an event more accurately and provide more in-depth insight into it.

Interestingly, it is these databases that have inspired my final project for this course. After reading the databases, and thinking about the importance of citizen histories, I’ve decided to create a database dedicated to the volunteers from the Vancouver 2010 Games. The Games were an important event in Canadian society and, I believe, deserve preservation. I know that there are hundreds, if not thousands, of amazing stories that are just waiting to be told, and hopefully I can compile these stories and do their storytellers justice.

Howdy everyone. Unfortunately, due to compiling technical difficulties, I haven’t been able to post my blog posts over the past couple of weeks, which means that I’m going to be posting entries from weeks ago. Mea Culpa :s

Growing up with a parent who is a web designer and a software engineer gave me an early look at programming languages such as Python, Ruby, C++ and Java. Unfortunately I was never really interested in programming; it seemed very tedious and frustrating. I remember taking an Econometrics course in my third year; we had an assignment where I needed to learn how to use software called STATA. It is not very complicated, and probably couldn’t even be considered “true programming”, but it uses its own tiny language to produce the output of a regression. I remember how frustrating it was doing that assignment, being stuck on one tiny line of code and having no idea how to fix it. For this reason, I always shied away from attempting any sort of coding.

Fast forward to this semester: I have to learn a coding language in my history class. What?! Why?! God dammit… I won’t lie, I was disappointed that I had to go back to the late nights of pulling my hair out and having a pot of coffee, trying to figure out the code. But I was in for a nice surprise. As far as programming languages go, Python has an element of elegance to it. It’s unbelievably simple and could make even the most daunting of tasks appear much simpler.

http://programminghistorian.org is an incredible open access online textbook that makes an already simple language even easier to learn. It looks at Python through the lens of a historian and brings the worlds of programmers and historians together into one. But why? I’m sure I wasn’t the only one to internally exclaim “Why do I need to know this for history?” Well, to answer that question, consider a small inside joke programmers have: “If you are going to do it more than three times, write a program for it”. Historians undertake repetitive tasks almost on a daily basis. Of course, it would be a mistake to say historians do the same thing day in and day out, but in many cases, especially in research, the tasks performed are at least somewhat similar. That is where Python comes in.

The beauty of programming is that you can adapt and modify an existing program to do something else. For example, after working through the Programming Historian lesson about multiple records and query strings, we should end up with something that looks like this:

[Image: screenshot of the Python program]

Now we have a program that will search the archive and return the values we look for. But suppose that a day later we want to search for something else. Do we need to write another program? Nope. Just replace the arguments and the URL (if needed) and run the program.
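As a rough sketch of that idea, and not the lesson's actual program, the snippet below builds a search URL from a base address plus query-string arguments. The base URL, the function name `build_search_url` and the parameter names are all hypothetical; a real archive will have its own.

```python
from urllib.parse import urlencode

# Hypothetical base URL and parameter names, used only for illustration.
BASE_URL = "http://www.example.org/search"

def build_search_url(base_url, **params):
    """Combine a base URL with query-string parameters."""
    return base_url + "?" + urlencode(params)

# The same function serves two different searches:
# only the arguments change, not the program.
first = build_search_url(BASE_URL, keyword="theft", fromYear=1700, toYear=1750)
second = build_search_url(BASE_URL, keyword="forgery", fromYear=1800, toYear=1820)

print(first)
print(second)
```

Searching for something else is then a matter of changing the keyword or the years passed in, which is the "just replace the arguments" step described above.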

Python was created in 1991; in a way, it’s a piece of history itself. It appeared back when computers were still an uncommon household object, and yet it is still useful, relevant and simple. After having worked through some of the lessons, I am convinced that Python definitely has merit as a tool for historians. In an increasingly digitized world, historians soon will have no choice but to adapt, especially with collections becoming more and more digitized. As the world and technology change, looking through books and making trips around the world to check out a particular piece of parchment or sort through entire volumes of records is becoming a little bit absurd. Those opposing change need to ask themselves “What is there to lose?” We should not allow stubbornness to stand in the way of innovation, even if innovation means using a language that is more than two decades old. It’s better late than never.

 

Trying to decide what to write in this post was very difficult. Not because I don’t feel adept talking about programming, but because I find that I am in a much different position in terms of my programming expertise. I have had experience, in some capacity, programming in C, Java, Scheme, HTML and Python. I have been programming since I was 13 and began seriously programming at the age of 16. So as opposed to learning how to program for the first time, The Programming Historian was a tool to relearn an old skill that I was already proficient in.

This was my first reaction to Python.

Learning to program is a terrifying, frustrating and wonderful adventure, all at the same time. It requires a great deal of determination, intellect and above all patience. It will be the cause of moments of great joy and great anger. There are going to be times where you’re going to want to light your computer on fire and then throw it through the window but, in the end, it is all worth it.

After reading through and experimenting with the different lessons in http://programminghistorian.org/ I found several things that I liked and things I didn’t like. I thought that for a new programmer it gave useful lessons for things that historians would likely need to do. Showing how to scrape information from a webpage, for instance, and how to manipulate strings in order to get useful pieces of information from large quantities of data is great. However, I find sometimes they show things without truly explaining them. List comprehensions are a fairly advanced concept (at least that’s what I feel) and the site sort of brushes over them fairly quickly without explaining them, even though they go on to use them quite frequently. Obviously, in the interest of being concise, some things must be skipped, but I feel if you’re going to bring up a concept, you must explain it in a little detail. I acknowledge though that this might just be nitpicking because I am more advanced than the targeted audience.
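For anyone who got lost at that point, here is a short illustration of what a list comprehension does, written as a sketch rather than taken from the site. The word list and variable names are made up; the point is that the comprehension is just a compact way of writing an ordinary loop.

```python
words = ["Amongst", "them", "a", "black", "there", "was", "one"]

# Loop version: build the new list one item at a time.
lowered_loop = []
for word in words:
    if len(word) > 3:                      # keep only the longer words
        lowered_loop.append(word.lower())  # and lowercase each one

# List comprehension: the same steps written as a single expression.
# Read it as: "word.lower() for each word in words, if the word is long enough".
lowered_comprehension = [word.lower() for word in words if len(word) > 3]

print(lowered_loop)
print(lowered_comprehension)
```

Both versions produce the same list; the comprehension simply folds the loop, the test and the append into one line.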

I need to try PERL too.

I was able to take a lot away from the website personally though. It allowed me to quickly learn the basics of Python and build programs that will aid my research ventures. I wasn’t a huge fan of Python at first. It sort of freaked me out because I am so used to needing to write complicated statements, and in a way I still love Java because it is a little more structured, but Python is awesome. It is so simple to use and is perfect for writing small programs in a short amount of time. It is a great language to get your feet wet with if you’re trying programming for the first time. It has all the necessary functions that you might need and it is extremely forgiving of mistakes.

As for the bigger question of whether programming is something that historians should be involved with, my answer is an emphatic yes. In 20 years, the majority of the primary sources for the year 2013 will be digital. The photo below is an Instagram post from NBC showing how the world has changed in 8 years, as seen in St. Peter’s Square.

We live in a digital world. I honestly believe that everyone should have some experience with programming. It annoys me that seemingly only students in the sciences and mathematics have any sort of required programming course. I think that programming courses should be mandatory for all programs. This may be an ambitious and extreme idea now but in twenty years or so I think we will be much closer to this than people think. Historians are going to need to know how to take their ideas and put them online, otherwise our field will always be at the mercy of others.

It was nice to get back into programming as, before the start of this year, I hadn’t programmed in almost 2 years. It’s like a terrible drug in a way. There might not be a better rush than solving a tough problem with some savvy coding, and no greater low than not being able to figure out why your code’s not working after six hours of staring at brackets.

EDIT: This is awesome

Also, this is awesome