Development


Development and Howto | 02 Mar 2008 03:25 pm

I have been a big supporter of Subversion for a long time. I have used it off and on for about 3 years now and have never really had very many troubles while using it. It has always suited my needs and actually allowed me to really learn what version control is, why it’s useful and how to use it. Until recently, I have been very satisfied with its capabilities.

A Distributed Problem

When I created the MyEPICS 2.0 application, I had a vision to make a general framework which can be used outside the context of a service learning management program. Fast forward about two years and I feel I have a stellar framework. I have one tiny problem though: I need to make modifications inside the framework’s directory structure without changing the main Subversion repository. In other words, I have a project that I want to base upon the MyEPICS Framework, but I don’t want to use the main Subversion repository for its version control because not everything I do is going to go back into the framework.

An example directory structure is below:

/
-/ME
--Core.php
--...
-/Modules
--/<--- Here is where I want to change things

Purely Subversion Option

If I were to want to do something like this, I would have to check out each Module directory individually from my own repository, and manually do each svn command in each module directory to get up to speed. Not to mention, if I want to update it server side, I must update each directory individually.

On the other hand, I could just set up a new repository for my own work. For any changes outside of the Modules directory, I could manually generate a diff, apply it to the main repository and commit, and do the reverse to get changes from the main repository. It's tedious, but it can work.

Mercurial managed

With a Mercurial-managed repository, I can check out the Subversion repository to some directory and create a Mercurial repository on top of that working copy. From there, I can hg clone the repository to do any of my own work. I can play around with the Modules directory and make all of my own changes without affecting the main Subversion repository. Any time I have a change to add to the main Subversion repository, I just svn commit it. If I have a change to pull from the main Subversion repository, I just run
svn update
hg commit

And I have an updated repository! The only downside is that, for some reason, a clone of my main Mercurial repository doesn't allow me to run my Subversion commands, such as updating and committing. Mercurial doesn't see a few directories inside the .svn directory, and I will have to find out why at a later date.

Making the Switch

No Central Repository

Probably the biggest difference is that there is no "central" repository to work with. This means that any changes you create can be isolated to your environment. You can fork an entire project, set up your own version control, and sync up with the original whenever you want w/o its maintainers really knowing. In fact, that's the entire problem that I've had working with Subversion... I couldn't find a way to do this. Now, try to argue that there isn't a "main" repository and you'll just be wasting energy. In every project, there has to be some "official" repository which contains the "official" code. In the Linux kernel, that's Linus' repository. Other projects may have an "official" repository which is maintained by a select group of people.

Ignoring Files

Mercurial uses a file named .hgignore to tell it which files to ignore, as opposed to Subversion, which uses the svn:ignore property set on a directory. In this file, you can list either individual file names or patterns, written as regular expressions (the default) or as shell-style globs. The man page on hgignore is helpful in this case.

Meta Repositories, Tagging, and Branching OH MY!

In the case of Subversion, it is recommended that each repository contain all the tags and branches in its repository layout. Subversion even allows multiple projects to live within one large repository, thus making a meta-repository.

In Mercurial, there is no repository layout. Your repository is just your project itself, so there is no trunk, branches or tags directory. In Subversion, if you wanted to branch or tag, you would just copy a directory in the repository to branches or tags and work from there. In Mercurial, if you wish to make a branch, you simply hg branch it (thanks Luke!). When you want your changes to be seen, you push them back or wait for someone to pull from you. If you want to tag something, it's accomplished via hg tag.

Helpful Hint

While I was reading the Mercurial red-bean book, I came across such an excellent note that I'm going to repeat it verbatim:

Note: If you’re new to Mercurial, you should keep in mind a common “error”, which is to use the “hg pull” command without any options. By default, the “hg pull” command does not update the working directory, so you’ll bring new changesets into your repository, but the working directory will stay synced at the same changeset as before the pull. If you make some changes and commit afterwards, you’ll thus create a new head, because your working directory isn’t synced to whatever the current tip is.

I put the word “error” in quotes because all that you need to do to rectify this situation is “hg merge”, then “hg commit”. In other words, this almost never has negative consequences; it just surprises people. I’ll discuss other ways to avoid this behaviour, and why Mercurial behaves in this initially surprising way, later on.

Conclusion

After working with Mercurial for a few weeks, I have a somewhat better understanding of how it works, and I enjoy using it much more than I did when working on the Rubix Cube project. For now, this will allow me to have my own version control while keeping in sync with the main Subversion repository.

Development | 07 Jan 2008 02:15 pm

Nathan has been working pretty hard on releasing version 0.3 of our Hanzi Recognizer program, and finally it has been released, which includes a small change in the GUI.

Over the past few days, word got out about our Hanzi Recognizer program, which has generated upwards of 300 downloads a day, with 1000 total downloads soon to be reached. I’m not quite sure where the popularity all of a sudden sprouted, but we had increased our downloads per day by two orders of magnitude in just a few days, AMAZING!

Development and School | 25 Nov 2007 03:04 am

Hanzi Recognizer is a project that has been born out of CS490t at Purdue University by myself and Nathan Hobbs. The information below is current as of 11/24/2007.

Features

When you draw a Chinese character in the drawing panel, and look it up, it gives 5 of the closest matches. For each character, it also informs the user of the definition, pronunciation, main radical and main radical definition.

[Image: hanzirecognizer.png]

High Level Implementation Details

The stroke recognition and scoring algorithm isn’t the greatest, but it is based upon JavaDict, a 10-year-old Java 1.2 program. A stroke is defined as everything from pen down until pen up. The algorithm turns each stroke into a number that represents the direction in which the stroke was made. When you make a stroke that changes direction (like drawing a capital L), it is registered as two directional strokes.

Directional strokes:
         7   8   9
          \  |  /
        4 ---5--- 6
          /  |  \
         1   2   3

So an L would be considered a 26 stroke: down (2), then right (6). Currently, it then scores the given stroke against every stroke in the database, with the lowest score being the best match.
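As a rough sketch of the direction encoding described above (this is not the actual Hanzi Recognizer or JavaDict code), the Python below maps pen movements to the numpad-style digits in the diagram. The function names, the assumption of screen coordinates (y grows downward), and the absence of any jitter filtering are all assumptions made for illustration.

```python
import math

# Numpad-style digits indexed by octant, starting at "right" and going
# counter-clockwise as seen on screen: 6=right, 9=up-right, 8=up, and so on.
# The center digit 5 never appears because it has no direction.
OCTANT_TO_DIGIT = "69874123"


def direction_digit(dx, dy):
    """Map a pen movement to a direction digit.

    Assumes screen coordinates, where y grows downward.
    """
    angle = math.atan2(-dy, dx)  # flip y so that "up" is a positive angle
    octant = round(angle / (math.pi / 4)) % 8
    return OCTANT_TO_DIGIT[octant]


def encode_stroke(points):
    """Turn a list of (x, y) pen samples into a string of direction digits.

    Consecutive samples moving in the same direction collapse into one digit.
    """
    digits = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # the pen did not move between these samples
        digit = direction_digit(dx, dy)
        if not digits or digits[-1] != digit:
            digits.append(digit)
    return "".join(digits)


# A capital L: straight down, then straight right.
capital_l = [(0, 0), (0, 5), (0, 10), (5, 10), (10, 10)]
print(encode_stroke(capital_l))
```

A real recognizer would need to smooth out small wobbles in the pen path before encoding; this sketch treats every sample as significant.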

We then have a Unistrok file that contains how to draw every character. This file looks similar to:

6c34 | 61 2 1 3

Where it is [unicode | strokes].
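To make the file format concrete, here is a minimal Python sketch that parses a line like the one above and builds a lookup table keyed by character. The function names are hypothetical, and the assumption that each space-separated token is one stroke is inferred only from the single sample line.

```python
def parse_unistrok_line(line):
    """Split a '[unicode | strokes]' line into (character, list of strokes)."""
    code_point, strokes = line.split("|")
    character = chr(int(code_point.strip(), 16))  # hex code point -> character
    return character, strokes.split()


def build_index(lines):
    """Index stroke lists by character so a lookup is a single dict access."""
    return dict(parse_unistrok_line(line) for line in lines if line.strip())


index = build_index(["6c34 | 61 2 1 3"])
print(index)
```

Keeping the entries in a dictionary also avoids scanning the whole list each time a character is looked up.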

To Do

Currently, we need a better scoring algorithm. If there are too many or too few strokes drawn, then we just arbitrarily give it a bad score. This can be improved through implementing an edit distance algorithm.
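One way the edit distance idea could look is sketched below in Python, using the standard Levenshtein distance over lists of stroke codes. This is an illustration of the technique, not code from the project; treating every insertion, deletion and substitution as cost 1 is an assumption.

```python
def edit_distance(drawn, stored):
    """Levenshtein distance between two sequences of stroke codes."""
    # previous[j] holds the distance between the strokes of `drawn` handled
    # so far and the first j strokes of `stored`.
    previous = list(range(len(stored) + 1))
    for i, drawn_stroke in enumerate(drawn, start=1):
        current = [i]
        for j, stored_stroke in enumerate(stored, start=1):
            substitution = previous[j - 1] + (drawn_stroke != stored_stroke)
            current.append(min(previous[j] + 1,      # drawn has an extra stroke
                               current[j - 1] + 1,   # drawn is missing a stroke
                               substitution))
        previous = current
    return previous[-1]


# One stroke too many costs 1 instead of an arbitrary bad score.
print(edit_distance(["61", "2", "1", "3"], ["61", "2", "3"]))
```

With a score like this, a character drawn with one extra or one missing stroke still ranks close to the intended match.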

Allowing for characters to be searched by pronunciation or radical.

It takes a long time to load. This is a side effect of the fact that our characters are stored unsorted, therefore it takes O(n) time to search for a character. In itself that’s not slow, but doing that 50,000 times is slow. We need to sort the characters as we store them and search with a binary search (or hash and search in O(1)).

Nate just implemented some basic speech functionality. We need to get some speakers to speak the pronunciations and implement this.

Currently we are creating our own XML file format to store the characters so we don’t have to get information from 3 different databases. This file can be rebuilt as the others are updated, and itself would cut down on the load times by almost a full order of magnitude.

We need to add more support for more variants of radicals and simplified / traditional characters. Currently, there are many places where we (seemingly) arbitrarily choose to display either a traditional or simplified variant.

References

Unihan Database

CEDict

Development | 15 Nov 2007 12:21 am

Make sure in your prepared statement, you don’t put quotes around your question mark, it’ll save you some debugging!

Bad:

SELECT * FROM User WHERE name='?'

Good:

SELECT * FROM User WHERE name=?
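
To see the difference in action, here is a small self-contained sketch using Python's built-in sqlite3 module with an in-memory database. The table contents are made up for illustration. With this particular driver the quoted version fails loudly, because the statement contains no placeholder to bind to; other drivers may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (name TEXT)")
conn.execute("INSERT INTO User (name) VALUES (?)", ("alice",))

# Bad: the quoted '?' is a string literal, not a placeholder, so the
# statement has zero parameters and the supplied value is rejected.
try:
    conn.execute("SELECT * FROM User WHERE name='?'", ("alice",))
    bad_result = "no error"
except sqlite3.ProgrammingError as error:
    bad_result = f"rejected: {error}"

# Good: the bare ? is bound to the supplied value.
good_rows = conn.execute(
    "SELECT * FROM User WHERE name=?", ("alice",)
).fetchall()

print(bad_result)
print(good_rows)
```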

EPICS | 28 Oct 2007 06:44 am

Since I talked about Ohloh.net earlier, I submitted the MyEPICS project to the site for fun. What I got back was that the MyEPICS 2 project was valued at over $1,000,000 and would take approximately 19 person-years to develop. Amazing!

Development and EPICS and Howto | 22 Oct 2007 08:07 pm

So I have been doing some more Gnarly Queries for MyEPICS.

We have a table called TeamChoice that has 3 columns: userid, teamid, and choice. These represent the 5 choices a person makes when registering for EPICS. There is also one special row for each person: if the person is not yet assigned to a team, they are given a row with choice=0 that holds their current team choice. So with 5 team choices, each user has 6 rows in the database.

The Scenario

What I wanted to do is find all the students who are currently in their first team’s bucket. So that means that for each user, I wanted to get the teamid where choice=0, and compare it to where choice=1. If they are the same teamid, I wanted to return the result. This probably could be accomplished by some script, but a database should be able to give me the results w/o any scripting.

The Method

The way to accomplish this is through subqueries. I realized that I am essentially querying against two tables, then doing a join on the userid. The first table is all the records for students and their first choice; the second table is the records for students and their current choice. If I then join the two tables together on their userid, I will have each student's first choice and current choice side by side. All I need to do next is add a where clause that makes sure that I only get records where the two results are the same, and it’s over.

The Query

SELECT User.username
FROM (
    SELECT TeamChoice.*
    FROM User
    INNER JOIN TeamChoice ON User.id = TeamChoice.userid
    WHERE TeamChoice.choice = 1
) AS firstchoice
INNER JOIN (
    SELECT TeamChoice.*
    FROM User
    INNER JOIN TeamChoice ON User.id = TeamChoice.userid
    WHERE TeamChoice.choice = 0
) AS currentchoice
    ON currentchoice.userid = firstchoice.userid
INNER JOIN User ON User.id = currentchoice.userid
WHERE firstchoice.teamid = currentchoice.teamid
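
For anyone who wants to try the idea on toy data, the Python sketch below builds the two tables in an in-memory SQLite database and runs an equivalent query. The sample rows are invented, and the query is a simplified variant that joins TeamChoice to itself directly instead of using subqueries; it is an alternative shape for the same lookup, not the query used in MyEPICS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE TeamChoice (userid INTEGER, teamid INTEGER, choice INTEGER);
    INSERT INTO User VALUES (1, 'alice'), (2, 'bob');
    -- alice is currently in her first-choice team, bob is not
    INSERT INTO TeamChoice VALUES
        (1, 10, 0), (1, 10, 1), (1, 11, 2),
        (2, 12, 0), (2, 10, 1), (2, 12, 2);
""")

query = """
    SELECT User.username
    FROM TeamChoice AS firstchoice
    INNER JOIN TeamChoice AS currentchoice
        ON currentchoice.userid = firstchoice.userid
    INNER JOIN User ON User.id = currentchoice.userid
    WHERE firstchoice.choice = 1
      AND currentchoice.choice = 0
      AND firstchoice.teamid = currentchoice.teamid
"""
usernames = [row[0] for row in conn.execute(query)]
print(usernames)
```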

Development and Howto and School | 17 Oct 2007 01:40 pm

Today I finished up some of the coding for my Rubix Cube project using DirectX and LINQ. Luke Hoersten and Nate Hobbs were my partners in developing the program. All of us worked on designing a sufficient data structure. I worked on implementing the data structure and the user interface / DirectX stuff. Luke focused on refactoring the code and coming up with parts of the solving algorithm. Nate worked on parts of the solving algorithm.

Previous Posts

In previous posts, I outlined some problems we had while developing the application and also some design decisions that were made to include LINQ. Our program solves the cube by going about it the way a human would.

Solving the cube

The program starts by trying to solve the bottom layer, then the middle layer, then finally the top layer. Luke would be able to give a much better explanation of how it works. Our professor suggested that we create a tree by attempting every move, then for every move trying every move, etc… We thought this would not be a great way to solve it, for two reasons. The first is that there are 18 possible moves to make on a cube, so if you want to go n levels deep, the program would have to compute 18^n states. Not only that, but it would have to store every state, score them, then take the path with the best score. To go 6 levels deep, you need to compute 34 million different states; for 7 levels deep, you need to compute 612 million states. Computers are good, but they aren’t that good.
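The numbers above come straight from raising 18 to the search depth. A quick Python check (the function name is just for illustration):

```python
MOVES_PER_STATE = 18  # possible moves from any cube state, as stated above


def states_at_depth(depth):
    """Number of move sequences of exactly `depth` moves."""
    return MOVES_PER_STATE ** depth


for depth in (6, 7):
    print(depth, f"{states_at_depth(depth):,}")
```

This counts only the states at the deepest level; storing the whole tree would add all the shallower levels on top.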

This would not be completely terrible if you had a smart heuristic for scoring. If every one of those states you computed were guaranteed to get you closer to the solution, then you would only be spending time solving it, and time is something a computer has. The heuristic that was presented was to measure the proximity of each block to where it should be. This is not the greatest either, because what happens when you are almost finished solving it and you have only one or two cubes out of place (or oriented wrong)? You end up having to mess up your cube really badly in order to solve it, and if your algorithm doesn’t go deep enough, you won’t find your solution because your current state will always score better than where you are going.

The Program

Basically, here is the outline of the program:

Block.cs – A tiny block within the rubix cube. The rubix cube contains 27 of these blocks.
Cube.cs – The cube structure, contains helper functions to orient and move sides of the cube.
Solver.cs – The class that solves the rubix cube.
UI.cs – Handles the DirectX and drawing of the rubix cube.

We haven’t formally licensed the program, but I’m sure Luke and Nate wouldn’t mind if we say it’s GPL’d. Basically give us credit for whatever you do with it and don’t sell it.

URL: Rubix Cube in C# using LINQ
Alternatively, you can also check out the Mercurial sources at: Luke’s Mercurial Sources

Development and PHP | 25 Sep 2007 12:00 pm

**I HAVE UPDATED THIS, PLEASE READ THE UPDATE**
I have recently been working with Zend_Pdf and it appears that their ability to put text on a screen is very lacking compared to FPDF. I have recently been diving into expanding the functionality and have provided the following class: Zend_Pdf_Cell. This will allow a developer to create a “cell” within a page, center the cell and align the text within the cell.

The following is a small sample of how to use the cell:


    $pdf=new Zend_Pdf();
    $pdf->pages[] =new Zend_Pdf_Page(Zend_Pdf_Page::SIZE_A4);

    $font=Zend_Pdf_Font::fontWithName(Zend_Pdf_Font::FONT_TIMES_ITALIC);
    $pdf->pages[0]->setFont($font,12);

    //create and attach the cell to the first page, and center in the X and Y direction
    $cell=new Zend_Pdf_Cell($pdf->pages[0],Zend_Pdf_Cell::POSITION_CENTER_X | Zend_Pdf_Cell::POSITION_CENTER_Y);
    //align the text in the center
    $cell->addText("The quick brown fox jumped over the lazy dog.",Zend_Pdf_Cell::ALIGN_CENTER);
    $cell->newLine();
    //change the font
    $cell->setFont(Zend_Pdf_Font::fontWithName(Zend_Pdf_Font::FONT_TIMES_BOLD),28);
    //align this to the right
    $cell->addText("The quick brown fox jumped over the lazy dog.",Zend_Pdf_Cell::ALIGN_RIGHT);
    $cell->newLine();
    $cell->setFont(Zend_Pdf_Font::fontWithName(Zend_Pdf_Font::FONT_TIMES),10);
    //align this to the left
    $cell->addText("The quick brown fox jumped over the lazy dog.",Zend_Pdf_Cell::ALIGN_LEFT);
    $cell->newLine();
    $cell->setFont(Zend_Pdf_Font::fontWithName(Zend_Pdf_Font::FONT_TIMES_ITALIC),28);
    $cell->addText("The quick brown fox jumped over the lazy dog.",Zend_Pdf_Cell::ALIGN_CENTER);
    //finally write the cell to the page.  Text will not show up unless you write to the page.
    $cell->write();

    //create a new cell and center on just the X coordinate
    $cell=new Zend_Pdf_Cell($pdf->pages[0],Zend_Pdf_Cell::POSITION_CENTER_X);
    $font=Zend_Pdf_Font::fontWithName(Zend_Pdf_Font::FONT_HELVETICA);
    $cell->setPosition(Zend_Pdf_Cell::POSITION_TOP);
    $cell->addText("The quick brown fox jumped over the lazy dog.");
    $cell->write();

    $cell=new Zend_Pdf_Cell($pdf->pages[0],Zend_Pdf_Cell::POSITION_RIGHT,$pdf->pages[0]->getWidth());
    $font=Zend_Pdf_Font::fontWithName(Zend_Pdf_Font::FONT_COURIER);
    $cell->setPosition(Zend_Pdf_Cell::POSITION_BOTTOM);
    $cell->addText("The quick brown fox jumped over the lazy dog.",Zend_Pdf_Cell::ALIGN_RIGHT);
    $cell->write();

    $pdf->save($this->uploadDir.'/TEST-CELL2.pdf');

I have attached not only the Cell patch, but also the diff of font resource files separately.

In the original source for the fonts, we can see that they didn’t use the correct widths:

/* The glyph numbers assigned here are synthetic; they do not match the
*  actual glyph numbers used by the font. This is not a big deal though
*  since this data never makes it to the PDF file. It is only used
* internally for layout calculations.
*/

So when attempting to calculate things like a string width, you would get the wrong numbers. I looked up the numbers on Adobe’s Font and Type Technology Center, particularly the Unix fonts. From there I wrote a small PHP script that would output the ASCII value and the corresponding glyph width. Using these numbers, I updated the Zend font resource files, which gives you the diff below. Because these fonts only contain width values for the ASCII numbers 32 – 251ish, those are all that were updated.

Zend Font Width Patch

Cell Patch

Development | 19 Sep 2007 02:09 pm

Over the first 4 weeks of my senior semester, I have been working on a Rubix Cube application. It started out difficult, such as how we were going to represent the data structure. Once that was decided (using a lot of LINQ for helper functions), I started to focus on parts of the UI. During these parts of the UI, I ran into a couple of pitfalls while using DirectX that held me back.

Culling

If you make the standard left-hand rule shape with your left hand, look at which plane you are defining your points on. If you are defining your points on the XY plane, notice where your index finger is pointing. If you are not looking at your cube from that direction, your item will not show up because of culling. You can turn culling off, but remember that you will take a performance hit. I preferred to keep it off, because if my graphics card cannot handle ~100 triangles w/o culling, I might as well get another computer or do the computations by hand!

Z Buffering

Contrary to what I believed, DirectX does not enable Z buffering by default. My book explained how to turn it on, but until I did, whatever I drew last was drawn over everything else, even when it should have been hidden behind something in a nearer Z layer.

Those two problems gave me a cube that looked crazy, and with both combined, debugging was impossible because there was no intuitive reason why I would see my faces sometimes and not at other times.

This next week or so, we are going to be working on an installer for the application that will allow somebody to install and run the program. I’m not sure how many people will be happy to have to install DirectX 9 and .NET 3.5 to run our little 500 KB or so Rubix Cube software, but hey, it was a fun project!

Development and School | 29 Aug 2007 03:02 pm

In CS 490T I am working on a team with two others, one being Luke, creating a Rubix Cube project. We have been struggling to find a nice, clean, object oriented way to represent the cube, but I believe that we have found a good solution. We are going to have a Cube class with methods that rotate sides of the cube. This class will have one property, a collection of Blocks. A single block has six properties: front, left, right, bottom, top, back. We will perform a rotation by specifying a face (top, left, etc…) and a direction, clockwise or counterclockwise. The class outline is below.

Class Cube {
 Block blocks[20];
 rotate(Face, Direction);
}

Class Block {
 Color top;
 Color bottom;
 Color left;
 Color right;
 Color front;
 Color back;
}

The position of a block can be represented by which faces are defined. If a block does not have a color on a given side, that property will not be defined. For example, the top middle piece will have only its top property set to something other than null. This way, you can know which position a block is in by knowing which sides have colors defined. So the block with top, left and front defined is the block in the upper left corner closest to you.

 ______
|\_\_\_\
|\|X| | |
|\| | | |
 \|_|_|_|

I believe that this Rubix Cube has many valid uses for LINQ. One such use is described below.

We were going to do a lot of nasty if statements like

if (top != null) { // some block on top
 if (front != null) { // some front block
  if (left != null) { // top front left
  } else { // top front right
  }
 }
 // etc...
}

But instead, with LINQ, we can write a query along the lines of the following (shown here as SQL-like pseudocode):

SELECT * FROM blocks WHERE top!=NULL AND front!=NULL AND left!=NULL;

And that will do all the work for us.
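The same idea can be sketched in Python, where a list comprehension plays the role of the query. The Block fields mirror the class outline above; the helper name blocks_with_faces, the color strings and the sample blocks are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    # A face is None when the block has no color on that side.
    top: Optional[str] = None
    bottom: Optional[str] = None
    left: Optional[str] = None
    right: Optional[str] = None
    front: Optional[str] = None
    back: Optional[str] = None


def blocks_with_faces(blocks, *faces):
    """Return the blocks whose named faces all have a color defined."""
    return [b for b in blocks
            if all(getattr(b, face) is not None for face in faces)]


blocks = [
    Block(top="white", front="red", left="green"),  # top front left corner
    Block(top="white", front="red"),                # top front edge
    Block(top="white"),                             # top middle piece
]

corner = blocks_with_faces(blocks, "top", "front", "left")
print(corner)
```

One declarative filter replaces the nested if statements, which is the appeal of using a query here.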

I was thinking about other ways to use LINQ, such as the actual updates when a move is done.

UPDATE blocks SET top=(SELECT color FROM blocks WHERE...) WHERE ...

This will update all the block colors for a given move. This may or may not be simpler than a lot of if/else if’s with manual updating.
