Posts Tagged hgweb

v0.7 Release - Retrieving the Total Number of Entries

Earlier today I outlined the goals of my current release. One of the problems I was looking to fix was with the function, getMaxEntries(). The job of this function was to create an xmlHttpRequest() in order to retrieve the maximum number of push entries in the repository. The pushlog displays data in reverse chronological order and thus I need to know the maximum number of entries so that I know which entries to display first to maintain the same order.

For example, in my test repository I have 2613 total entries. The first 10 entries are displayed by default, so when the user scrolls down to load more data the very first entry shown would be #2603, then 2602 and so on. Previously, to retrieve the maximum number of entries I was using the following function:

function getMaxEntries() {
  var entries = new XMLHttpRequest();
  // Asynchronous request: the callback below only runs once the response arrives.
  entries.open('GET', '/json-pushes?startID=0&endID=1', true);
  entries.onreadystatechange = function() {
    // readyState 4 means the request has completed.
    if (entries.readyState == 4 && entries.status != 404) {
      var entryData = JSON.parse(entries.responseText);
      var max = entryData[1].max;

      // start is a global that loadEntries() reads later.
      start = max - 10;
    }
  };
  entries.send(null);
}

I didn’t really like this method of using an xmlHttpRequest() just to retrieve one value. I had to call this function onPageLoad() to calculate the maximum number of entries. This had to be done before my OnScroll function was called or various horrible errors would occur.

I ran into something about JavaScript that caught me off guard. An asynchronous xmlHttpRequest() doesn’t block: send() returns right away, and the onreadystatechange callback only runs later, once the response has arrived. So getMaxEntries() returns almost immediately, before the value it is supposed to set is available. This was causing me quite a few headaches. As I said before, getMaxEntries() had to finish its work before the OnScroll function was called, otherwise things would go horribly wrong. getMaxEntries() sets the value of start, which is then used by loadEntries() (called OnScroll) to retrieve data. Since the response was taking too long to come back, loadEntries() was running before start had been set, completely wrecking my logic.
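
To see why the ordering goes wrong, here is a toy model in Python. It is only a sketch and not the pushlog code: the pending list, the state dictionary and both function names are made up for illustration, and the "response" is delivered by hand at the end.

```python
# Toy model of an asynchronous request. Nothing here is real pushlog code:
# `pending`, `state` and both function names are made up for illustration.
pending = []             # callbacks waiting for a "response"
state = {"start": None}  # plays the role of the global `start`


def get_max_entries(on_done=None):
    """Queue the request and return at once, like send() on an async request."""
    def on_response(total):
        state["start"] = total - 10
        if on_done is not None:
            on_done()    # safe place to continue: start is set by now
    pending.append(on_response)


def load_entries():
    """Stand-in for loadEntries(): just reports the start it would use."""
    return state["start"]


results = []
get_max_entries(on_done=lambda: results.append(load_entries()))
too_early = load_entries()   # runs before the response: start is still None

for callback in pending:     # the "response" arrives later
    callback(2613)

print(too_early, results)    # None [2603]
```

The call made right after get_max_entries() sees no value, while the one made from inside the callback sees the right one. That is the general shape of the problem: anything that depends on the response has to run from the callback, not after the function call.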

Also, getMaxEntries() was causing Firefox to freeze for 10-15secs for reasons unknown to me. This way of calculating the maximum number of entries was neither elegant nor acceptable. I needed to come up with a better solution.

The Solution

hgweb renders its pages through Mercurial’s template system. The maximum number of entries is calculated via a database query on the server side. In order to pass this server side variable to the client side I had to add the max=query.totalentries line to the tmpl() call in pushloghtml() in hgpoller\pushlog-feed.py:

    return tmpl('pushlog',
                changenav=changenav(),
                rev=0,
                max=query.totalentries,
                entries=lambda **x: changelist(limit=0,**x),
                latestentry=lambda **x: changelist(limit=1,**x),
                startdate='startdate' in req.form and req.form['startdate'][0] or '1 week ago',
                enddate='enddate' in req.form and req.form['enddate'][0] or 'now',
                querydescription=query.description(),
                archives=web.archivelist("tip"))

Now, this change gave me access to the variable in the page template. All I had to do was use one of these formats: {<var_name>} or #<var_name>#. However, as far as I know, I can only access this variable in HTML, not in JavaScript, which is a significant drawback.

So in order to retrieve this data I added an id attribute to the following div tag in hg_templates\gitweb_mozilla\pushlog.tmpl:

<div id="#max#" class="page_header">
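
To make the substitution concrete, here is a toy sketch of placeholder filling in Python. It is not Mercurial's templater, just a few lines that mimic the two placeholder formats mentioned above; the render() helper and the value 2613 are assumptions for illustration.

```python
import re


def render(template, **values):
    """Toy stand-in for a template engine: fills {name} and #name# placeholders.

    Unknown placeholders are left untouched. This is an illustration only,
    not how Mercurial's own templater is implemented.
    """
    def fill(match):
        name = match.group(1) or match.group(2)
        return str(values.get(name, match.group(0)))

    return re.sub(r"\{(\w+)\}|#(\w+)#", fill, template)


html = render('<div id="#max#" class="page_header">', max=2613)
print(html)  # <div id="2613" class="page_header">
```

By the time the page reaches the browser the placeholder is gone and only the number is left in the id attribute, which is why script code can then read it back from the DOM.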

Now, I could set the value of start to the maximum number of entries:

start = parseInt($("div.page_header").attr("id"), 10);

Thus, I can easily pass the correct value of start to loadEntries(), which makes an xmlHttpRequest() to retrieve the correctly ordered data when the user scrolls down.
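
The arithmetic behind start can be sketched in a few lines of Python. This is an illustration under assumptions, not the pushlog code: next_batch() is a hypothetical helper, the batch size of 10 comes from the default page size mentioned earlier, and push IDs are assumed to run from 1 up to the total with no gaps.

```python
def next_batch(start, batch_size=10):
    """Return the push IDs of the next batch, newest first, and the new start.

    `start` is the ID of the newest entry not yet shown. IDs are assumed to
    run from 1 up to the total without gaps (an assumption of this sketch).
    """
    stop = max(start - batch_size, 0)
    ids = list(range(start, stop, -1))
    return ids, stop


total = 2613
first_page, start = next_batch(total)     # 2613 .. 2604, shown on page load
second_page, start = next_batch(start)    # 2603 .. 2594, loaded on scroll
print(first_page[0], first_page[-1], second_page[0])  # 2613 2604 2603
```

With 2613 entries the first scroll therefore begins at #2603, which matches the example at the top of this post.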


v0.7 Release Goals

It’s time for another release for hgweb. I had thought about working on a new feature this release but I’ve decided to push that back to the next release. This time around I want to focus completely on one bug: getting pushloghtml to show more than 10 entries at a time.

Why? Well, I’ve put out multiple patches for this bug and it’s still not perfect. I realize that in software development bugs will always pop up somewhere but I’m just not satisfied with where my solution for this bug is right now. I really want to push for a complete solution this time around. The following are my goals for this release:

  • getMaxEntries(), the function which retrieves the total number of entries in the database, is causing the browser to freeze. Find a solution for this problem.
  • Initially only 10 entries are displayed, which means that the scroll bar doesn’t show up and thus more entries can’t be loaded, since the OnScroll event never fires. Solve this problem by dynamically loading enough entries, according to the user’s screen size, until the scroll bar appears. I’ve tried various solutions for this problem, none of which have worked well so far. I want to solve this once and for all.
  • I’ve discovered a weird bug (I seem to have a weird talent for discovering obscure bugs) when displaying merge changesets: some of them are not being displayed at all. Find a solution for this bug.
  • I’ve noticed another possible bug with merge changesets where the last entry in a merge changeset is sometimes repeated in the next entry. Is this a bug or not? Not sure. I need to investigate.
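
For the scroll bar goal, the idea of loading entries until the page overflows can be sketched as a small calculation. This is only a rough illustration in Python: entries_needed() is a hypothetical helper, and the viewport and row heights are made-up numbers, since the real values would come from the browser.

```python
def entries_needed(viewport_height, row_height, batch_size=10):
    """Smallest multiple of batch_size whose rows are taller than the viewport.

    Both heights are in pixels. Entries are added one batch at a time,
    mirroring the way the pushlog loads them in groups of 10.
    """
    count = batch_size
    while count * row_height <= viewport_height:
        count += batch_size
    return count


print(entries_needed(900, 40))  # 30
print(entries_needed(300, 40))  # 10
```

On a tall screen the first batch of 10 does not fill the page, so more batches are needed before a scroll bar can appear; on a short screen the default 10 entries are already enough.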


v0.6 Release - Removing the Page Navigation Links from Pushloghtml

I’ve decided to remove the navigation links that appear at the top and the bottom of pushloghtml. This wasn’t part of my goals for this release but I’ve decided to add this at the last minute. The patch is for bug 459727, which loads more data OnScroll and thus makes the navigation links obsolete. There is no need for them to be there anymore.

Removing them was pretty simple. All I had to do was remove the following code from hg_templates/gitweb_mozilla/pushlog.tmpl:

Page #changenav%navpushentry#

I had to remove the above line of code from two places to get rid of both the top and bottom navigation links. Taking these navigation links out removes the unneeded clutter from the page.


v0.6 Release - Refactoring to Fix the Bitrotting Issue with Bug 459727

I had mentioned in my previous blog post that hgpoller/pushlog-feed.py had bitrotted. One of my goals for this release was to make changes to the current version of pushlog-feed.py so that my patch is no longer broken for bug 459727. I’ve finally made those changes, which mainly occur in pushes_worker(). The following is what this method looks like with my changes:

def pushes_worker(query, repo):
    """Given a PushlogQuery, return a data structure mapping push IDs
    to a map of data about the push."""
    pushes = {}
    for id, user, date, node in query.entries:
        mergeData = []
        ctx = repo.changectx(node)
        if len(ctx.parents()) > 1:
            for cs in ctx.parents():
                mergeData.append(hex(cs.node()) + '|-|' + clean(person(cs.user())) + '|-|' + clean(cs.description()))
        if id in pushes:
            # we get the pushes in reverse order
            pushes[id]['changesets'].insert(0, node)
            pushes[id]['mergeData'].append(mergeData)
        else:
            pushes[id] = {'user': user,
                          'date': date,
                          'changesets': [node],
                          'formattedDate': util.datestr(localdate(date)),
                          'individualChangeset': hex(ctx.node()),
                          'author': clean(person(ctx.user())),
                          'desc': clean(ctx.description()),
                          'mergeData': mergeData,
                          'max': gettotalpushlogentries(conn)
                          }
    return pushes

Basically I had to pass in repo (web.repo) so that I could have access to repo.changectx(node). This now allows me access to ctx.parents(), which I need to retrieve merge changeset data. I also went through the whole file and changed every place where pushes_worker() was called so that repo is passed in as a parameter along with query.
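
The merge handling can be tried without a repository using a stand-in object. The sketch below is an illustration only: FakeCtx is a hypothetical replacement for the context returned by repo.changectx(node), merge_data() is a made-up helper name, and the hex(), person() and clean() calls from the real code are left out to keep it self-contained.

```python
class FakeCtx:
    """Hypothetical stand-in for a change context; only what the sketch needs."""

    def __init__(self, node, user, description, parents=()):
        self._node = node
        self._user = user
        self._description = description
        self._parents = list(parents)

    def node(self):
        return self._node

    def user(self):
        return self._user

    def description(self):
        return self._description

    def parents(self):
        return self._parents


def merge_data(ctx):
    """Describe each parent of a merge changeset; non-merges yield an empty list."""
    if len(ctx.parents()) <= 1:
        return []
    return ["|-|".join([p.node(), p.user(), p.description()]) for p in ctx.parents()]


left = FakeCtx("aaa", "alice", "left branch")
right = FakeCtx("bbb", "bob", "right branch")
merge = FakeCtx("ccc", "carol", "merge", parents=[left, right])
print(merge_data(merge))  # ['aaa|-|alice|-|left branch', 'bbb|-|bob|-|right branch']
```

A changeset with more than one parent is a merge, so only those produce any merge data; the '|-|' separator is the same one used in pushes_worker() above.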

These are all the changes I needed to make to the server side code. Now I’ll have to examine the changes that were made to the client side which caused my patch to bitrot.


v0.6 Release - Examining the Changes with hgpoller/pushlog-feed.py

I had mentioned in my v0.6 goals blog post that my patch for bug 459727 had bitrotted. Unfortunately significant changes were made to hgpoller which apparently broke my patch. I need to remedy this situation because all my server side functionality for this patch is in the pushlog-feed.py file.

I downloaded the latest hgpoller source code and had a look at the changes that had been made. The file has changed in quite a few places. It seems that the function that my patch alters has been changed as well. I’m talking about pushes_worker(), which is responsible for passing the data from the server side to the client side.

def pushes_worker(query):
    """Given a PushlogQuery, return a data structure mapping push IDs
    to a map of data about the push."""
    pushes = {}
    for id, user, date, node in query.entries:
        if id in pushes:
            # we get the pushes in reverse order
            pushes[id]['changesets'].insert(0, node)
        else:
            pushes[id] = {'user': user,
                          'date': date,
                          'changesets': [node]
                          }
    return pushes

Now, the problem is that I need access to web.repo within pushes_worker() so that I can call repo.changectx(node) but right now, I don’t have access to repo within the method. I’ll have to figure out a way to do that somehow.
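
For reference, the grouping that pushes_worker() performs can be tried on its own with plain tuples. This is a minimal sketch under assumptions: group_pushes() is a hypothetical name, and the rows are made-up sample data standing in for query.entries.

```python
def group_pushes(entries):
    """Group (push_id, user, date, node) rows into one record per push."""
    pushes = {}
    for push_id, user, date, node in entries:
        if push_id in pushes:
            # Rows arrive newest first, so prepend to restore commit order.
            pushes[push_id]["changesets"].insert(0, node)
        else:
            pushes[push_id] = {"user": user, "date": date, "changesets": [node]}
    return pushes


# Made-up sample rows, newest first.
rows = [
    (2, "alice", 1236000200, "c3"),
    (2, "alice", 1236000200, "c2"),
    (1, "bob", 1236000100, "c1"),
]
pushes = group_pushes(rows)
print(pushes[2]["changesets"])  # ['c2', 'c3']
```

Nothing in this loop touches the repository, which is why the function needs repo passed in before it can look up anything about an individual changeset.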


v0.6 Release - Fixing Annotate for the Paper Theme

In my last release I had put out an hg annotate fix for the gitweb_mozilla theme. Dirkjan Ochtman, a Mercurial Project developer, noticed my release and asked me to come up with a similar fix for the Mercurial Project’s paper theme. I decided to take up the task and see if I could get similar results for the paper theme as I did with gitweb_mozilla.

The paper theme uses a HUGE table to display the results just like gitweb_mozilla. I tested the current version of the theme using a large 10,000 line cpp file. It gave me a loading time of ~30sec, which is ~10sec longer than gitweb_mozilla.

The following is the code to fix annotate:

map

annotateline = '<div class="l#parity#"><div class="codeauthor"><a href="{url}annotate/{node|short}/{file|urlescape}{sessionvars%urlparameter}#{targetline}" title="{node|short}: {desc|escape|firstline}">{author|user}@{rev}</a></div><a class="codeline" href="#{lineid}" id="{lineid}">{linenumber}</a>{line|escape}</div>'

style-paper.css

div.codeauthor {
    display: inline-block;
    width: 16ch;
    text-align: right;
    color: #999999;
    text-decoration: none;
    margin-right: 25em;
}
a.codeline {
    color: #999999;
    text-decoration: none;
    margin: 0 10px;
}
div.l0 {
    background-color: #f6f6f0;
}
div.l0, div.l1 {
    display: block;
}
pre.completecodeline {
    font-size: 90%;
    line-height: 1.4em;
    font-family: monospace;
    white-space: pre;
}
div.headrev {
    float: left;
    margin-right: 32em;
    margin-left: 8em;
    font-size: 90%;
    font-weight: bolder;
}
div.headline {
    font-weight: bolder;
    font-size: 90%;
}

I changed the code (see above) and tested annotate again to examine the difference in loading time. I got some surprising results. The fix for gitweb_mozilla brought down the file size by 25%, but in this case the reduction in file size wasn’t nearly as significant. Also, the reduction in loading time for gitweb_mozilla was ~15sec, but for the paper theme the loading time was only reduced by ~12sec. Currently, on my machine the loading time has gone down from ~30sec to ~18sec.

The speed increase isn’t as significant for the paper theme as it was for gitweb_mozilla. Why is that? I don’t exactly know. Obviously there are other factors coming into play that aren’t allowing a similar speed boost. Nonetheless, there is a noticeable improvement in loading time.


v0.6 Release - Minor fix for bug 445560

All the way back in November 2008 I had put out a patch to implement expand and collapse functionality (for merge changesets) for the pushlog. I had gotten an r+ review but I just needed to make some minor adjustments to the patch. Time went on and I totally forgot about implementing that minor fix. I was looking through some of my previous work and then I realized that I had forgotten about this.

The change is pretty simple but important. I just have to use proper naming conventions for variable names, which I wasn’t doing before. This is the type of thing that is easily overlooked (I was guilty of that myself), but it is important that we abide by naming conventions. Why? So that the code is readable when other people inevitably come along to read or change it.

In hgpoller/pushlog-feed.py I changed “Id” to “id”, as all other variable names are lowercase:

entry = {"author": ctx.user(),
         "desc": ctx.description(),
         "files": web.listfilediffs(tmpl, ctx.files(), n),
         "rev": ctx.rev(),
         "node": hex(n),
         "tags": nodetagsdict(web.repo, n),
         "branches": nodebranchdict(web.repo, ctx),
         "inbranch": nodeinbranch(web.repo, ctx),
         "hidden": "",
         "push": [],
         "mergerollup": [],
         "id": id
         }

In hg_templates/pushlog.tmpl I changed id to pushid to make it clearer:

var pushid = $(this).attr("class");
pushid = '.' + pushid.substring(11, pushid.length);
$(pushid).nextAll(pushid).toggle();


v0.6 Release Goals

It’s time to start working on my 3rd release for this semester. My goal is to put out 3 patches this time around:

These 3 patches should combine to make a good solid release. These are 3 separate improvements to hgweb that I’m sure users must be looking forward to having. I’m not quite sure what changes have been made to pushlog-feed.py that caused my patch to bitrot. The solution to the problem may be simple or it may be complicated. I was under the impression that no changes needed to be made, but new code has been added, causing my code to break. Another exciting prospect is that I’ll have my name added to the Mercurial Project once I implement the annotate fix for the paper theme, making my work available to all hgweb users.

I’ll be putting all the details of my work on the project page and this blog. Time to get to work!


v0.5 Release Complete

I’ve finally completed my 0.5 release. This release I tackled an interesting problem with hg annotate. Trying to improve efficiency and loading time of an application isn’t something I had tackled before. It was a good experience trying to figure out the solution to this problem. I’ll be putting up a new patch on the bug page very soon. The following are some of the important links for this release:

View the project page for more details.

EDIT: I’ve posted the patch for this release. Have a look…


v0.5 Release - Running an Experiment

In my last post I revealed my fix for the hg annotate loading issues. My fix reduced the loading time to a relatively reasonable ~8sec considering the fact that currently the loading time is ~20sec. However, it was still bugging me that Mat’s implementation was producing faster loading times than my fix. His implementation is ~1sec faster.

Mat’s Implementation

The problem is that this implementation doesn’t use valid HTML. Mat uses two non-standard tags, x and l#parity#. Although, I must say that this is a very unorthodox and smart solution to this problem:

annotateline = '<l#parity#><x><a href="#url#diff/#node|short#/#file|urlescape#{sessionvars%urlparameter}">#author|obfuscate#@#rev#</a></x><a href="##lineid#" id="#lineid#">#linenumber#</a>#line|escape#</l#parity#>'

Altering Mat’s Code into Valid HTML

As an experiment I decided to test what would happen if I took Mat’s code and replaced the x and l#parity# tags with divs. Would the implementation still remain as fast? Would it still be faster than mine? The following is the altered code:

annotateline = '<div class="l#parity#"><div class="codeauthor"><a href="#url#diff/#node|short#/#file|urlescape#{sessionvars%urlparameter}">#author|obfuscate#@#rev#</a></div><a class="codeline" href="##lineid#">#linenumber#</a>#line|escape#</div>'

div.codeauthor {
    display:inline-block; 
    width:16ch; 
    text-align: right; 
    color:#999999; 
    text-decoration:none;
    margin-right: 25em;
} 
a.codeline { 
    color:#999999; 
    text-decoration:none; 
    margin:0 10px; 
}                                   
div.l0 {
    background-color:#f6f6f0;
}                                                                    
div.l0, div.l1 { 
    display:block; 
}
pre.completecodeline { 
    font-size:12px; 
    line-height:1.4em; 
}
a.codeline:hover,
a.codeline:visited,
a.codeline:active {
    color: #880000;
}

Results

I tried testing the altered code and I was getting a loading time of ~7sec for this cpp file. However, I was never able to get into the ~5sec region which Mat’s original code was sometimes able to achieve. This altered version was only able to reach a minimum time of ~6sec.

The point is that this altered version is a bit faster than my fix but slower than Mat’s original implementation. I compared the file sizes of the 3 versions:

  • My fix: 2.4MB
  • Mat’s fix: 1.8MB
  • Altered version of Mat’s Fix: 2.2MB

It is interesting to note that using div tags instead of the x and l#parity# tags takes the file from 1.8MB to 2.2MB, an increase of about 22% (put the other way, Mat’s version is about 18% smaller). I don’t know exactly why that happens, but the x and l#parity# tags are clearly more compact than div tags. Whatever the reasons, the altered version of Mat’s fix seems to be the best solution to this problem at this time.
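
The percentages can be checked with a few lines of Python, along with a back-of-envelope guess at where the extra bytes might come from. The guess is only a plausible explanation, not something measured: it assumes a file of 10,000 lines (the size of the cpp file used in the paper theme test), counts only the wrapper tags, and ignores the class and id attributes on the line-number link.

```python
sizes_mb = {"my fix": 2.4, "Mat's fix": 1.8, "altered version": 2.2}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# Going from Mat's tags to divs, and the same gap seen from the other side.
growth = percent_change(sizes_mb["Mat's fix"], sizes_mb["altered version"])
saving = -percent_change(sizes_mb["altered version"], sizes_mb["Mat's fix"])
print(round(growth), round(saving))  # 22 18

# Rough estimate of the markup overhead per annotated line.
mat_tags = ["<l0>", "<x>", "</x>", "</l0>"]
div_tags = ['<div class="l0">', '<div class="codeauthor">', "</div>", "</div>"]
extra_per_line = sum(map(len, div_tags)) - sum(map(len, mat_tags))

lines = 10_000  # assumption: size of the annotated file
extra_mb = extra_per_line * lines / 1_000_000
print(extra_per_line, extra_mb)  # 36 0.36
```

Under those assumptions the longer tags alone add roughly 0.36MB, which is in the same range as the 0.4MB gap between the two versions, so the tag names themselves may account for most of the difference.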
