Rebelsky - CS2 Data Structures (Proposal)
A number of us are talking about ways to incorporate open source earlier in the curriculum, in courses that are already packed. In particular, many of us were considering what I refer to as CS2 - Data Structures and Algorithms, with perhaps a bit of Object-Oriented Programming and Abstract Data Types mixed in. That class is typically full enough that it is difficult to add something significant to it. And, as Cam said, the students aren't advanced enough to make deep contributions to an open-source project.
So ... I've been thinking about ways to incorporate an HFOSS project while still meeting the main goals of the course. One natural approach to me seems to be to take advantage of Ushahidi's data feeds. It should be pretty straightforward to build an infrastructure that lets students work with a local Ushahidi install, grab data from the install, and process it in different ways. I'll start by providing them with a wrapper class that handles communication with the server and parsing data into a collection of objects. Later in the semester (probably when we get to trees), I'll have them parse the raw JSON or XML data.
The initial focus for this wrapper class will be a standalone Java application that students may run through the terminal or Eclipse. However, I also plan to explore ways in which we might extend the Ushahidi Android App to provide opportunities to use the same API to achieve goals within that app. I've assigned three summer students to explore this opportunity.
Meeting Course Goals
In terms of the core algorithms and data structure issues, sorting and searching are two obvious approaches: "Write a program that presents the data in sorted order by ___. Generalize your core algorithm so that someone else can use your code to sort in different ways"; "Find all entries that meet these criteria. Generalize your code."
Ushahidi permits administrators to both validate and approve incidents. I can see having students explore relationships between the core linear structures (queue, stack, and priority queue) through a simple interface for approving incidents. That is, as data comes in, what order do you want to process it in and how do you store the data to make it easiest to process it in that order? Sometimes the newest request is the most important. Sometimes the first. And sometimes you have another kind of priority. This example also provides an opportunity to talk about some simple good designs. Ideally, the "validate" client can let us plug in whichever linear structure will best serve our policy. I see the API that they eventually program to as supporting two basic methods:
Meeting HFOSS Goals
Many of the HFOSS Community Activities that we've talked about don't quite seem appropriate for this kind of course. Students in the course generally aren't advanced enough to contribute code patches to the project. In addition, they don't really have time to do projects, such as writing documentation,that are tangential to the main learning goals of the course. So, how do they contribute beyond just using the data feed?
I hope to incorporate HFOSS in their work in two ways. For the first half of the semester, as they learn the core topics of the course, they will do homework assignments that involve gathering data from an Ushahidi install. I hope to set up an Ushahidi installation that logs incidents that they may find of interest on campus. Right now, my plan is to focus on campus outreach to the community - each time students volunteers in town, they can log their activities. I'll be meeting with Grinnell's director of Service Learning and Engagement to consider this approach. Alternately, we can track some kind of events on campus. In that case, I'll need to talk to our Office of Student Affairs to see what they think would be useful and acceptable.
Toward the second half of the semester, I hope to have students work with a client (an office on campus, someone in the community, an alum who is willing to be or to simulate a real client) to build a small Ushahidi application. To prepare students work with clients, I will probably have them follow appropriate aspects of the Ushahidi community. (After all, they'll need to have examples to show clients the kinds of things that they can build.)
So, how does this contribute back to Ushahidi? I think that this approach of *build things using an HFOSS application provides many of the things we want for students and for the community. They'll need to be a bit involved with the community to understand the uses of Ushahidi (and perhaps to set up their own application). They'll be pushing it in certain ways, which may provide opportunities for submitting feature and/or bug requests. (Perhaps they will even be inspired to learn enough to make changes, although I doubt that they'll have the time.)
In addition, if the research students who are extending the Ushahidi App to support the class develop extensions that will be useful to the broader Ushahidi community, they will contribute those back to the community.
I expect to write between six and nine "exercises". Some exercises will be designed to be short in-class lab activities. Others will be week-long programming homeworks. At the end of the semester, students will do a longer project that involves developing an Ushahidi installation. A list of potential exercises follows. (As I write each one, I'll make a link to a corresponding page. I've also started a Template for writing the exercises.
- Working with github (~1 hour)
- A short introduction to using github. This is somewhat secondary to the main project, but I do expect them to use github for the course. I'll be putting any "course code" on github and will ask them to submit via github.
- Using Ushahidi (~1 hour)
- Introduces students to Ushahidi and to the Ushahidi installation that we have set up for the course.
- Working with a simplified Ushahidi API (~1 hour)
- Introduces students to the simple Wrapper API that we've designed for communication with Ushahidi. Students will print out one or two incidents and perhaps approve an incident.
- Building the Ushahidi Andoid App (~2 hours)
- Walks students through the process of downloading the Android SDK and the Ushahidi code, building, and installing on an Android device.
- Setting up an Ushahidi Server(~2 hours)
- Walks students through the process of setting up a server. (Hopefully, a simplified version of what the POSSE Ushahidi team did.)
- Simple Linear Search (~1 hour)
- Using the simplified API, print out all entries that match some criterion. (E.g., all unverified reports more than a week old.)
- Binary Search (~4 hours)
- Given an id and an array of
UshahidiIncidents that are sorted by id, return the
UshahidiIncidentwith that id. Determine the number of comparisons that this took.
- Given a date and an array of
UshahidiIncidents that are sorted by date, return the last incident before the given date. Determine the number of comparisons required.
- Given a starting and ending date and an array of
UshahidiIncidents that are sorted by date, return the number of incidents that occur between those two dates.
- Testing Sorting Algorithms (~3 hours)
- Write a thorough set of unit tests for the method
sort(UshahidiIncident, Comparator<UshahidiIncident>). Submit your tests back to our common code repository.
- Selection Sort (~3 hours)
- Implement the
sortmethod using the selection sort algorithm.
- Quicksort (~3 hours)
- Implement the
sortmethod using the Quicksort algorithm.
- Heapsort (~5 hours)
- Implement the
sortmethod using the Heapsort algorithm.
- Generalizing Linear Search (~ 4 hours)
- Write and test a utility function that takes a stream of
UshahidiIncidents and a
Predicate<UshahidiIncident>as parameters and returns a list of
UshahidiIncidents for which the predicate holds.
- Stacks and Queues [~4 hours]
- The application "Approval" provides a simple interface which allows administrators to see and approve/reject/skip incidents one by one. Basically, incoming events trigger calls to the
addIncidentmethod and the UI does a call to the
nextIncidenteach time the administrator wants to consider another incident. But what happens when the incidents arrive faster than the administrator can approve the data? The application needs a policy to indicate which incident should be considered next.
nextIncidentso that the policy is "first in, first out". That is, the incidents are processed in the order in which they arrive.
nextIncidentso that the policy is "last in, first out". That is, the most recently added incident that has not yet been processed is processed next.
- Priority Queues [~3 hours]
- Of course, we may have other ways of prioritizing incidents. For example, some categories may have higher priority than others. Add a
setIncidentOrder(Comparator<UshahidiIncident>method and then implement
nextIncidentto support that order.
- Parsing and Trees [~4 hours]
- You may have been wondering how we get the data from the Ushahidi server. Ushahidi provides a Web API that returns data in JSON format. Write a program that reads in this format and prints it out nicely indented.
Summary: Meet one of the identified clients. Tell them about the capabilities of Ushahidi. Identify a project that they like that you think would be able to complete and that draws upon what you have learned this semester. Build the project. Present it to the class in a lightning presentation. 'Blog about it.
I'd like to make sure that the assignments are exciting ... not just "search for X" but "here's a reason you want to search for X". That will take some consideration.
I need to figure out the best way to show the results of these applications. Will they be interactive? Will they be command-line tools? Will they be Web-based? Will they be applets? Will they take advantage of a modified Android App?
- Learn more about Ushahidi and ways that it is used.
- Lurk in Ushahidi development community.
- Meet with Director of Service Learning and with Student Affairs to discuss possible projects.
- Develop local deployment that students can use in the course. (Requires picking a topic and gathering data.)
- Develop appropriate library to ease student use of the Ushahidi data feed.
- Develop assessment instruments.
- Obtain IRB approval for using assessment instruments.
- Experiment with extending the Ushahidi Android App to see if it is amenable to course use.
- Design assignments.
- Write assignments and supporting materials (e.g., instructions, code, unit tests).
- Test assignments.
- Post materials to foss2serve or teachingopensource
We only offer one section of this course each semester, so a direct course-to-course comparison is not possible.
Certainly, component of the assessment would be standard course goals pertaining to algorithms, data structures, and object-oriented design. My colleague who taught the previous section of the course did not return finals and kept those finals. Hence, it will be possible to reuse some questions from previous offerings of the course and to compare student answers.
From my perspective, the HFOSS component helps students build important software engineering skills, such as using provided libraries and working with clients. I don't think these aspects were as important to my colleague. Hence, I envision assessing these issues with a Likert-style confidence survey, with questions like: "I can use library code given just the documentation."
Attitudinal issues are also important. I expect to address these with Likert-style questions, such as "Computer software can improve the world." and "I can write software that makes a difference in people's lives."
I think it's important to look at general learning and attitudinal outcomes (not just CS, but things you might hope for in any course, such as understanding of how disciplines interoperate or building student confidence). I appreciate the general, course- and discipline-independent "Research in the Integrated Science Curriculum" (RISC) instrument, which includes a pre/post survey.
My colleague Henry Walker has had some success in having students fill out a short questionnaire after each new set of labs or assignments. I expect to adapt that to these assignments, although I will need to consider the appropriate granularity for such feedback (i.e., I may want to group some assignments together).
Grinnell has blanket IRB approval for using the RISC instrument. I will seek IRB approval for the additional instruments once they are designed.
I am also happy to provide a short experience report on each activity. (Okay, given how much I write, it's not clear that the reports will be short, but I'll try.)
I will ask students for their permission to share selected homework assignments with the POSSE/foss2serve community.
The RISC survey is relatively comprehensive. The pre-test includes
- demographic info (sex, ethnicity, status, major);
- post-graduate goals;
- expected course elements (e.g., "Working on a scripted lab or problem in which the students know the expected outcome", "Working on problems that have no clear solution", "Connecting your personal experience to the course problem or problems", "Writing a research proposal");
- opinions about self and science (e.g., "I wish science instructors would just tell us what we need to know so we can learn it", "Explaining science ideas to others has helped me understand the ideas better");
- self description on various scales that seem to assess Myers-Briggs personality types (e.g., "I would describe myself as reflective" to "I would describe myself as action oriented", "I would describe myself as evaluative and logical" to "I would describe myself as receptive and accepting"); and
- relative ranking on a variety of issues (e.g., "Creativity", "Leadership", "Participating in Class Discussions".
The post-test includes
- demographic info (sex, ethnicity, status, major);
- post-graduate goals;
- influence of course on post-graduate goals";
- learning gains on course elements;
- benefits (e.g., "Tolerance for obstacles faced in the research process", "Understanding of how scientists work on real problems");
- overall evaluation (e.g., "This course was a good way of learning about the subject matter"); and
- opinions about self and science".
RISC is used in a wide variety of courses at a wide variety of institutions. It has also been validated by comparative studies using interviews. Hence, it is quite useful for assessing general course outcomes.
I'm teaching our CS2 course this fall, so I'd expect to do development in the summer and pilot the approach in the fall. And, is the norm, I expect that as development progresses, some of my initial plans will go in new directions. (E.g., I may find that the linear structures assignment is cumbersome and that I want to add something more on tree traversal.)
I am fortunate to have a group of three summer research students available to work on parts of this project. Each completed the course this past spring. They are interested in helping build the wrapper library and to work on making the Android App more adaptable to this course. They are also willing to provide student-centric critiques of the assignments. And I think I can convince them to write some of the unit tests for the various assignments.
Deliverables are subject to change as the project develops.
- A Java library to make it easier for students to access Ushahidi data
- A set of about eight assignments that leverage Ushahidi to help students learn concepts in Data Structures, Algorithms, and Object-Oriented Design.
- A group of about three assignments to introduce Ushahidi and tools related to HFOSS.
- A project assignment in which students build an Ushahidi Installation for a client.
- Experience reports on those assignments and project.
- (Possibly, depending on time and students) An extended Ushahidi Android App amenable to classroom use.
Samuel A. Rebelsky
Department of Computer Science, Grinnell College
1116 8th Avenue
Grinnell, IA 50112
Email address via reCAPTCHA™ Mailhide