User:Scrain

From Foss2Serve

Dr. Steven P. Crain

I am an Assistant Professor in Computer Science at the lovely SUNY Plattsburgh [1] nestled in the Champlain Valley at the base of the Adirondack Mountains. When I am not working with students, I am often seen hiking in the mountains or sliding silently along a ski trail. (OK, so really I am seen in a jumbled up mess at the bottom of a small hill....) I am an aspiring 46er [2] and have hiked 27 of the 46 high peaks in the Adirondacks so far.

My research area is machine learning, with a special interest in humanitarian applications. I often work with the Diabetes Hands Foundation [3] on search and recommendation projects. As a rule, I would rather focus on involving undergraduate students in my work than on churning out publications.


Notes on Participation in HFOSS

Hiking Computer Science Blog

IRC

How do people interact?
The interactions are informal, productive and collegial.
What is the pattern of communication?
The meeting focuses on one person at a time, but everyone chimes in with whatever will be helpful, either to solve a problem the focal person is facing or to build community. Normally a post is understood to be directed to the focal person, but a specific person can be addressed if they raise a point that warrants discussion.
Are there any terms that seem to have special meaning?
There are many common software development terms (e.g. VM, UML). Beyond that, "lab" may have special meaning.
Can you make any other observations?
This is definitely a productive working meeting. They tackle one issue at a time, and try to either work through whatever is blocking as a group or at least point the person in a productive direction. When issues are more work than they are worth, they table the issue and find more productive things to work on.
Summarize your observations of #mifos
I found the IRC channel easily using a web search for "mifos freenode." There are 1 or 2 people hanging out on the channel, but no discussion going on at present. The channel topic indicates that the channel is logged, but the link to the log goes to a non-responsive server. The Internet Archive [4] has record of an IRC session log from 2010, but the log itself did not get archived.
Summarize your observations of #openmrs
Given that #mifos was a flop, I also looked at #openmrs. This channel has a bunch of automatic updates whenever anyone starts working on a task or commits a change. Also, the result of automatic builds gets logged to the channel. People generally start conversations by saying "hi" to the person they need to talk to. In one conversation, someone was concerned that a design decision someone else had made might not be wise long-term. He asked if the decision might be bad in the future. A collegial conversation proceeded in which they worked out the best plan and figured out who would implement it. Other conversations revolved around figuring out how to migrate features from one release to another and making arrangements to work together on particular tasks at a mutually convenient time.

Sugar Labs

What is unique about each Sugar Labs team?

Activity Team
The activities team is trying to foster development of shared activities. They have two coordinators and a large number of contributors listed on their contacts.
Development Team
The development team is responsible for the code and releases. They currently have a core of key people, but are looking for someone who can help coordinate them.
Documentation Team
The documentation team is producing documentation, including technical docs, videos and tutorials. Their contacts list two editors, one providing mentoring to new people and one helping to coordinate efforts.

Tracker

The tracker tracks enhancements, defects and tasks. For each issue, the tracker provides metadata (status, who opened it, when, who is watching it, which part of the project is affected, etc.), a description of the issue, a history of the discussion about the issue, and attachments that help explain it better.

Repository

Sugar Labs uses a common Git repository hosted at [5].

Release Cycles

Sugar Labs is a large project with a lot going on. They organize the work on the many aspects around an explicit release cycle for the whole project. At the beginning of each cycle, each team updates its roadmap to document what work they will do and what the relevant local milestones and deadlines need to be to meet the release cycle deadlines.

Sahana Eden

Community

This project is more organized in its approach in comparison with Sugar Labs. For example, developers are required to take some training and sign agreements before commencing. The public persona is also less personal, focusing more on the team identities than the individuals who lead them.

Developers
There are a lot of hoops to being a developer, including taking technical training and signing license agreements.
Testers
Less organized, you just have a manual to read through and follow the testing procedure. If you find any bugs, you submit them like ordinary bugs from anybody using the project. Not clear how to get good test coverage without knowing who is testing and what they are testing.
Designers
Potential designers are given guidelines on the ideal image of the sites, and can submit suggestions on how to improve the site to be more usable and attractive.

Tracker

How is the information here different than the information found on the Sugar Labs tracker page?
Instead of a form for customizing the list of issues to display, there is a set of prefabricated flavors of the report. The issues have less detail about how they relate to the project structure and do not have support for attachments. Each issue is assigned to a person who is responsible for the fix.
Click the Active Tickets link. Indicate the types/categories of tickets listed on this page as well as the information available for each ticket.
Types include defect/bug, documentation, enhancement, task. Each issue has a number, owner, description, status, time created, history, associated component, version, severity.

Repository

This project also uses a Git repository, hosted at [6].

Release Cycle

The release cycle is planned around completing certain selected functionality instead of planning for releases at certain times. The next release was planned to be done 3 years ago, but there is still work to go. The next cycle is planned out as well, and the following cycle is just sketched out.

Field Trip

Field Trip

Project Evaluation: Mifos

Mifos Evaluation

Getting Involved: OpenMRS

I had thought Mifos would be interesting, but I think OpenMRS will work better for me pedagogically. I am interested in:

  • incorporating some of my research results in personalized recommendation of resources specific to patients.
  • adding data mining integration to publicly available data, especially the FDA adverse drug events data. I am especially interested in this, having some personal experience with the consequences of doctors not having a good way to study this data in the context of a specific patient.
  • testing and documentation.

Course Applications

For my software engineering course, I liked Heidi Ellis's case study using a FOSS project as a running project in a software engineering course. I am thinking that OpenMRS could play an excellent role in this course, giving students a chance to: figure out what a component in a complex software project does and document it, for an audience of developers (e.g. UML diagramming) or users; perform tests; write automated tests. I might also be able to use it in the section on using revision control systems, but I would need to make a local repository they could mess around in.

For my data mining course, I generally give a large project that the class as a whole tackles. For this to work, I need something that can be broken into about 5 components that different groups of students can work on. I think that my idea of adding support for the adverse drug events data would work very well for that. One group of students can figure out how OpenMRS stores data, and how best the Adverse Event data should be incorporated into the existing database schema. Another group can work on the import itself, including parsing and cleaning the data and converting it into a format that integrates nicely with OpenMRS. Another group can work on extracting data about a specific patient for a personalized view of the adverse events. Another group can focus on mining the data given that other groups have imported it and provided an appropriate model of the target patient. The last group can manage the project and handle integration issues with the other teams.
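
A minimal sketch of the "personalized view" step described above, assuming each adverse event report and the target patient can be reduced to a set of drug names. The record layout, the placeholder drug names, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration; none of them reflect the actual FDA data format or the OpenMRS data model.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections, from 0.0 (disjoint) to 1.0 (identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(patient_drugs, reports, k=2):
    """Return the k (score, report id) pairs most similar to the patient."""
    scored = [(jaccard(patient_drugs, r["drugs"]), r["id"]) for r in reports]
    # Highest score first; ties broken by report id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


# Hypothetical reports and patient, invented for illustration only.
reports = [
    {"id": "r1", "drugs": {"drug_a", "drug_b"}},
    {"id": "r2", "drugs": {"drug_c"}},
    {"id": "r3", "drugs": {"drug_a", "drug_b", "drug_d"}},
]
patient = {"drug_a", "drug_b"}

result = most_similar(patient, reports)
for score, report_id in result:
    print(report_id, round(score, 2))
```

A real module would need richer features than drug names alone (age, conditions, dosage), which is the kind of question the mining group would have to work out.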

I am also wondering about using OpenMRS a little in my computer security course. It could be a good example of an application with strict legal and ethical privacy and auditing constraints. I don't want to make that too strong a component of the course yet, because I am really hoping to not redesign this course quite yet....

Bug Trackers

Part 1

I have used many different issue trackers over the years, so most of this I just knew. To get the values in use I just sorted on a particular column. When I wasn't sure, I looked at an example ticket with the uncertain field value to see what was going on with it. For each field, I found a link to a page explaining the significance of the possible values.

The bug list is initially sorted by status.

Bugs that are trivial or enhancements are greyed out, since they are not really things needing attention in the same way as other bugs. Red is used to indicate blocking or critical issues.


  • ID: The identifier for a specific bug.
  • Sev: The severity
    • Blocking, this bug is so serious it is preventing important development activities.
    • Critical, essential for correct operation of the product
    • Major, a bug affecting core functionality
    • Normal, a bug affecting ordinary functionality
    • Minor, a bug affecting peripheral functionality
    • Trivial, a bug that does not really make an important difference, like a typo
    • Enhancement, proposed new functionality
  • Pri: The priority of the bug.
    • Low: Low priority
    • Normal: Normal priority
    • High: High priority
    • Urgent: Urgent priority
  • OS: Which operating system ports exhibit the bug. I suspect that more bugs are really ALL than are marked that way.
    • All: All operating systems exhibit the bug.
    • Windows
    • Mac
    • Solaris
    • Open Solaris
    • Linux
    • other: Some other operating system.
  • Product: which module contains the bug.
    • Aisle Riot
    • at-spi
    • atk
    • balsa
    • banshee
    • baobab
    • bijiben
    • caribou
    • cheese
    • clutter
    • clutter-gtk
    • conduit
    • conglomerate
    • devhelp
    • dia
    • doxygen
    • ekiga
    • empathy
    • epiphany
    • evince
    • evolution
    • f-spot
    • gconf editor
    • gdm
    • gedit
    • And many, many others!
  • Status: how the work on the bug is progressing
    • Needinfo: cannot fix this without more information
    • Reopened: we thought we had that fixed....
    • Assigned: somebody is working on it.
    • New: just been added, nobody has looked at it seriously yet.
    • Unconfirmed: Not sure what this means. Maybe somebody thinks they fixed it, and is waiting for independent testing? No... docs say that nobody has independently confirmed that the bug is real.
  • Resolution: Not showing up, because the list by default only shows unresolved bugs. In general, it is blank until the bug is resolved; at that point the status becomes RESOLVED and the resolution records the outcome (e.g. FIXED).
  • Summary: A brief, user-created description of the issue.
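
The severity ordering and the colour coding described above can be sketched as follows. This is only an illustration of the notes on this page: the enum names follow the list above rather than the tracker's internal values, and the bug IDs are hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity values in the order listed above, most serious first."""
    BLOCKING = 0
    CRITICAL = 1
    MAJOR = 2
    NORMAL = 3
    MINOR = 4
    TRIVIAL = 5
    ENHANCEMENT = 6


def row_style(severity):
    """Return the display style the bug list appears to use for a severity."""
    if severity in (Severity.BLOCKING, Severity.CRITICAL):
        return "red"
    if severity in (Severity.TRIVIAL, Severity.ENHANCEMENT):
        return "grey"
    return "default"


# Hypothetical bug IDs, invented for illustration only.
bugs = [(101, Severity.ENHANCEMENT), (102, Severity.CRITICAL), (103, Severity.NORMAL)]

# Sorting on the severity column puts the most serious bugs first.
by_severity = sorted(bugs, key=lambda bug: bug[1])
for bug_id, severity in by_severity:
    print(bug_id, severity.name, row_style(severity))
```
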
Specific Bugs

I took a look at bug 381153, which relates to correctly reporting the reading direction of text that includes multiple languages, e.g. English reads left to right and Hebrew or Arabic read right to left. It is important that accessibility software know the proper sequence of characters.

  • Identify when the bug was submitted: 2006-12-01 05:34:21 UTC
  • Identify if there has been recent discussion about the bug: Yes, although this is an older bug, it was discussed just a month ago, noting that the solution is different now and the old proposed patch is obsolete.
  • Is the bug current? Yes.
  • Is the bug assigned? To whom? Technically yes, it is assigned to the mailing list gtk-bugs.
  • Describe what you would need to do to fix the bug.

In order to fix the bug, I would first need to install a gtk+ application and some accessibility software (probably a screen reader). This would allow me to verify that the bug exists and understand better exactly how it manifests. Next, I would need to examine the screen reader code to see how it is using the text direction attribute, and examine the existing gtk+ and gtkpango code and the patch that was submitted for consideration with the original bug report. Once I understood how the code works and how it is supposed to work, it would hopefully be relatively straightforward to add correction for the text direction. I have worked with text direction with mixed English/Hebrew in other software before, so at least I have a basis for understanding the problem. Having fixed the problem, it appears that the best thing is to attach a new proposed patch to the bug. It looks like the people subscribed to the bug would be able to review the patch and get it included. Of course, I would need to double check the instructions for fixing bugs to make sure that is the correct procedure to follow.

I next took a look at Bug 347475, which prevents users of the Gnome On-Screen Keyboard (GOk) from recovering after they accidentally enter an invalid date. An error dialog pops up, but there is something preventing the GOk software from dismissing the dialog.

  • Identify when the bug was submitted: 2006-07-14 02:57:52 UTC
  • Identify if there has been recent discussion about the bug: There was some discussion up until 2 years ago, but nothing recently.
  • Is the bug current? Yes and no. There is dispute as to whether the bug is valid. The program apparently works correctly as specified, but it poses significant usability problems for end users. The usability problem is probably still present.
  • Is the bug assigned? To whom? No, it is marked UNCONFIRMED.
  • Describe what you would need to do to fix the bug.

The problem seems to be that sometimes error dialogs do a "mouse grab", which means that they force the user to interact only with the dialog until they have dealt with the error. This is evil from a general usability point of view (What if I need to compare information in the dialog with something else it is covering? What if I want a screenshot of the error? The user should have control of the mouse, not an app.), but it is crippling for users who need to interact with an accessibility program that will interact with the dialog for them. I certainly won't convince too many programmers that they don't really need a mouse grab, so the trick is to find a way to get mouse grabs in general to play nicely with accessibility software. So, it turns out that this bug really has nothing to do with Evolution's calendar support and everything to do with Gnome architecture. That means that I need to first study the Gnome architecture and how mouse grabs fit into the general Gnome philosophy, and then propose a modification to the architecture that would address the conflict. There are two possible solutions, and I would need to understand Gnome better to pick between them: A) The accessibility software could act like the pointer to all other applications, so that the mouse grab controls what the accessibility software can direct while leaving the user free to direct the accessibility software with the real pointer. B) The accessibility software could have an attribute that allows it to share the pointer, even when another app does a mouse grab. The trick is to keep developers from abusing this so that mouse grabs become meaningless because everyone overrides them.

Once I had a proposed solution, I would need to find the communication channel where the core architects for the project hang out, in order to run the proposed solution by them. This isn't the kind of change somebody just makes and hopes people like it!

Part 2: Reports

How many bug reports were opened in the last week? How many were closed?
I looked at the summary report for last week. A total of 321 bugs were opened and 444 were closed.
What was the general trend last week? Were more bugs opened than closed or vice versa?
More were closed than opened.
Who were the top three bug closers? Why is this important to know?
These are the people to coordinate with if you want to fix a bug.
  • Bastien Nocera
  • Jean-François Fortin Tam
  • Emmanuele Bassi (:ebassi)
Who were the top three bug reporters? Are these the same as the top three bug closers? What is the overlap in these two lists?
Only one of the top three reporters is also a top closer. In fact, only Jean-François Fortin Tam and Sebastian Dröge (slomo) are in both lists at all. The top reporters are:
  • Jo
  • Michael Catanzaro
  • Jean-François Fortin Tam
Who are the top three contributors of patches?
  • Ray Strode [halfline]
  • Bastien Nocera
  • Cosimo Cecchi
Who are the top three reviewers of patches?
  • Sebastian Dröge (slomo)
  • Florian Müllner
  • Debarshi Ray
What is the overlap between these lists and the bug closers and bug reporters?
Bug reporter and patch contributor: Aurélien Zanelli, Garrett Regier
Closer and patch reviewer: Bastien Nocera, Florian Müllner, John Ralls, Milan Crha, Sebastian Dröge (slomo), Tim-Philipp Müller, Zeeshan Ali (Khattak)
Bug closer and patch contributor but not also reviewer (several had all three): Ondrej Holy
What is the overlap between patch contributors and patch reviewers?
Bastien Nocera, Jonas Danielsson, Sebastian Dröge (slomo)
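
The overlap questions above amount to set intersections. Here is a minimal sketch using only the top-three names listed on this page; the overlaps reported above were taken from the longer lists in the weekly report, so they include people who do not appear here.

```python
from itertools import combinations

# Top-three lists as recorded above.
top_three = {
    "closers": {"Bastien Nocera", "Jean-François Fortin Tam", "Emmanuele Bassi (:ebassi)"},
    "reporters": {"Jo", "Michael Catanzaro", "Jean-François Fortin Tam"},
    "patch contributors": {"Ray Strode [halfline]", "Bastien Nocera", "Cosimo Cecchi"},
    "patch reviewers": {"Sebastian Dröge (slomo)", "Florian Müllner", "Debarshi Ray"},
}


def pairwise_overlap(groups):
    """Return {(a, b): names appearing in both a and b} for every pair of lists."""
    return {(a, b): groups[a] & groups[b] for a, b in combinations(groups, 2)}


overlaps = pairwise_overlap(top_three)
for (a, b), names in overlaps.items():
    print(f"{a} / {b}: {sorted(names) or 'no overlap'}")
```
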
Click on the “Generic Reports” link.
Plot the Severity of each Version of the Accessibility features of Empathy.

Report-2014-11-10.png

What other reports can you generate?

Just about anything you want. You can plot any number of features vs. another feature and restrict the data to include by quite flexible criteria. The output can be in several graphical and tabular formats.
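
The severity-by-version report above is essentially a cross-tabulation. A minimal sketch of that idea follows, assuming the bug list can be exported as (version, severity) pairs; the rows below are invented for illustration and are not real Empathy data.

```python
from collections import Counter

# Hypothetical (version, severity) pairs, invented for illustration only.
bugs = [
    ("3.10", "normal"),
    ("3.10", "minor"),
    ("3.12", "normal"),
    ("3.12", "normal"),
    ("3.12", "major"),
]


def tabulate(rows):
    """Count bugs per version and severity, filling in zeros for missing cells."""
    counts = Counter(rows)
    versions = sorted({version for version, _ in rows})
    severities = sorted({severity for _, severity in rows})
    return {v: {s: counts[(v, s)] for s in severities} for v in versions}


table = tabulate(bugs)
for version, row in table.items():
    print(version, row)
```
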

Teaching Possibilities

Recalling your list of activities/topics from the "FOSS in Courses Planning 1" activity, identify the ways that these FOSS activities/topics can be structured.
For software engineering especially, aim for a stream of related activities, including lectures, in-class activities, homework assignments. Detailed:
Lectures, e.g. on FOSS contribution, Git and other FOSS tools.
In-class activities practicing being productively lost and figuring out how a system works.
In-class activity, looking at design documentation for OpenMRS and discussing the architectural choices that were made, pros, cons, alternatives.
Homework, creating documentation for OpenMRS, including design diagrams, tutorials, general documentation of how to use a specific feature.
Homework, designing and conducting tests
Homework, fixing a bug
Project, large, whole-class project adding an enhancement to OpenMRS.
   List the revised activities on your wiki page. For each activity/topic:

Software Engineering

Identify some possible learning outcomes that should be fulfilled with the activities/task.
Learn to read and understand design documentation.
Create design documentation.
Learn/practice interviewing users to gather requirements. This can probably be done in collaboration with the hospital or the School of Nursing.
Learn to use version control systems effectively.
Learn to design tests for a specific requirement and to conduct tests.
Learn to use a bug tracker effectively, especially refining existing defects so they are easier to fix but also logging new defects.
Practice finding the source of a bug, fixing it and making a patch.
Describe any pre-requisite knowledge needed to complete the activity. This does not need to be a complete list.
This is tied into the general software engineering curriculum I normally teach, so for example I would be lecturing on the specification process in connection with assignments on interviewing users to gather requirements. The main additional content needed would be lectures introducing the students to working on a small piece of a very large project, FOSS philosophy and work styles, Git for revision control, and problem solving strategies for isolating, fixing, and regression checking bugs.
Estimate the time required for instructor prep, for student completion and elapsed calendar time. Are you going to have to synchronize your activity with the community or can the activity/topic be covered independent of the HFOSS community schedule.
Instructor prep is probably several hours each week, especially selecting appropriate hints to get the students going in the right direction. Student effort is probably quite high: even though the assignments I envision would be similar to what I currently give, the extra complexity of working on a very large project will require substantially more startup time. Plus, the assignments will probably take longer to complete than the relatively simple analogs I have been using. I doubt my students will produce anything good enough to actually submit a patch, but they might well produce useful tutorials, bug reports, etc. This kind of contribution should not need any special coordination.
Think about possible input required from the HFOSS community. How much input is required and what kind?
None needed, although having an experienced FOSS contributor visit could be energizing.
If the result of the activity is contributed back to the HFOSS project, describe the contribution and its usefulness.
The main contribution would be tests, bug report maintenance, documentation and tutorials. Testing and working with the defect databases would have practical value to the developers. The documentation would likely be only a first pass, so I do not know if that is useful or not. Any tutorials are likely to help beginners.
Describe the assessment/grading approach - What will the basis for grading be? Will this be a team activity or individual? Is there a role for the HFOSS community in helping assess student work? For instance, must the work be committed or otherwise accepted by the community?
For this course, I require teams of 2 or rarely 3 students. I will be doing the assessment, and will base the grade on whether they complete the requirements of the assignments. Often this will mean it is not quite up to par for contributing back to the FOSS project, but may be a good start.
List any questions or concerns that you have about the activity/task.
I haven't taken a good look at the documentation for OpenMRS, so I am not sure whether there is plenty of work the students can do or if it is already in great shape. I am also concerned with whether I can select projects that do not require a complex setup on the student machines. I think it might be fine if I avoid having them work on the OpenMRS core and instead have them work on some of the optional modules.
List any stumbling blocks or barriers to carrying out the activity/task.
Again, the code base is large, and it will be hard to get my students to successfully navigate it. Also install could be a concern.

Data Mining

Identify some possible learning outcomes that should be fulfilled with the activities/task.
Understand data preparation.
Learn to select and apply relevant machine learning algorithms for a data mining task.
Feature engineering: defining and extracting and using appropriate features.
Usability and user interface design for a data mining project.
Privacy aspects of personalized medical data mining tasks.
Ability to work with a HFOSS community to integrate enhancements cleanly, both from a technical and social point of view.
Practice data mining with large, real-world datasets.
Describe any pre-requisite knowledge needed to complete the activity. This does not need to be a complete list.
The students will need to understand the nature of medical records, the nature of adverse drug events and how they can be related. They also need to know how to select cases that are relevant to a particular user and how to model the selected cases in order to perform a personalized risk analysis from the data. They will also need an introduction to HFOSS philosophy and practical workflow.
Estimate the time required for instructor prep, for student completion and elapsed calendar time. Are you going to have to synchronize your activity with the community or can the activity/topic be covered independent of the HFOSS community schedule.
This is intended to be the core of a course, with substantial prep time for the instructor, on the order of 5 weeks of preparation over the summer. That is OK since the course is next taught in the fall. The student effort is also substantial, on the order of 1,000 hours in aggregate. We should be able to have the enhancement at the alpha level by the end of the semester. I would need to investigate if the proposed functionality is already present in OpenMRS, but I doubt that it is.
Think about possible input required from the HFOSS community. How much input is required and what kind?
We need to make sure that the project integrates properly with OpenMRS, which probably means publishing our plans to the OpenMRS architects pretty early on and getting feedback on the details. We also need to balance the needs of the class project with being open to contributions from outside contributors.
If the result of the activity is contributed back to the HFOSS project, describe the contribution and its usefulness.
We would be contributing a module that allows practitioners to identify records in the FDA's adverse drug event database that are similar to a specific patient, in order to better understand the risks of a drug to a specific patient. The data set is public, but access to it in a meaningful form is only currently available through expensive for-profit options as far as I know. This would be highly valuable for the project.
Describe the assessment/grading approach - What will the basis for grading be? Will this be a team activity or individual? Is there a role for the HFOSS community in helping assess student work? For instance, must the work be committed or otherwise accepted by the community?
For this course, I always split the work into about 5 parts and then have the students submit a resume and cover letter for which team they want to join. Since the project is really a research project, the grading is largely based on appropriate levels of effort and quality of work. I can use the revision control system to track the individual contributions for assessment. I generally also have the students provide feedback to other teams, and assess how well they respond to the feedback.
List any questions or concerns that you have about the activity/task.
None really.
List any stumbling blocks or barriers to carrying out the activity/task.
I think that OpenMRS comes with mock medical records that we can use for testing, but there could eventually be issues related to human subjects research and approval of the tool if it is perceived as a medical device.


Computer Security

Identify some possible learning outcomes that should be fulfilled with the activities/task.
Understand HIPAA and its implications for MRS.
Analyze the security of a complex system.
Develop security requirements for a complex system.
Describe any pre-requisite knowledge needed to complete the activity. This does not need to be a complete list.
The students will need an understanding of what an MRS is and the kinds of data it contains.
Estimate the time required for instructor prep, for student completion and elapsed calendar time. Are you going to have to synchronize your activity with the community or can the activity/topic be covered independent of the HFOSS community schedule.
Instructor prep time is a few hours for each of several lectures and assignments. The effort on these assignments for the students is about 10 hours per assignment. The activity does not require any coordination with the FOSS project.
Think about possible input required from the HFOSS community. How much input is required and what kind?
It would be great if the students could interview members of the HFOSS community who have a special interest in privacy and security.
If the result of the activity is contributed back to the HFOSS project, describe the contribution and its usefulness.
There may be contribution in the form of security-related bug reports the students identify.
Describe the assessment/grading approach - What will the basis for grading be? Will this be a team activity or individual? Is there a role for the HFOSS community in helping assess student work? For instance, must the work be committed or otherwise accepted by the community?
I will encourage the students to work in groups of 2-3 students, but most will opt to go it alone. I will use a traditional assessment based on the quality of their security analysis and their ability to coherently write about what they find or design.
List any questions or concerns that you have about the activity/task.
None
List any stumbling blocks or barriers to carrying out the activity/task.
None

About Working With OpenMRS

I am trying to get OpenMRS installed on my laptop, Windows 7, 64-bit. I already had JDK installed, but I needed to install Maven, Eclipse and GitHub. Actually, I am not sure if I really needed GitHub.

After making a fork of openmrs, I tried to install it according to the instructions, but 2 of the tests failed in openmrs-api. Who knows at this point—the error is probably somebody else's to worry about and not a show stopper for me. I found that "mvn install -fn" will go ahead and install even though there are some failures.

No good: it still refuses to install. I commented out File:Patch.txt to get it to install. I wonder if they work fine once you have an installed system, but not during the initial install?
