Fram


Revision as of 17:17, 17 January 2009 by Haakon (Talk | contribs)

Fram is a MediaWiki extension, or set of extensions, that turns MediaWiki into a collaborative literate programming platform for developing free software. The goal is to make it easier for developers, documenters, translators and artists to participate in free software projects, because the only tool you need to get started is your browser.

You get all the benefits and familiarity of the platform that runs Wikipedia and thousands of other sites: scalability, version control of all pages, discussion and member pages, anonymous or authenticated edits, and access to a wealth of third-party extensions such as syntax highlighting.

At present, Fram lets you mix written explanations with computer instructions (code) and then extract the code to a file you can download and run on your computer. This way of writing software is called literate programming. This page contains the actual Fram code we are running now. Fram already lets you write several different files in one article, but we still need to make it easy to extract an entire project spread across different pages into an archive file.

You can help improve the code or the explanation, or help translate this page: click the Edit tab at the top of the page or edit one of the sections, hit the Save button, and you are already participating! Don't worry, you can't really break anything. We just open a previous working version, save that, and everything is all right again. It's that easy!

"Fram" is a Norwegian word meaning "Forward" - safe journey!



The code - what we are running at present

At present, Fram is identical to RawFile version 0.2, so we are keeping that version number. You can of course just download Fram by following this link :-)

So let's explain the code a bit, in a literate programming way...

Hooks

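First, we register some hooks for our functions.

We will create a setup function, a hook that declares the magic words, and a hook that intercepts the raw page output: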


<?php
 
if (defined('MEDIAWIKI')) {
 
$wgExtensionFunctions[] = 'efFram_Setup';
$wgHooks['LanguageGetMagic'][]       = 'efFram_Magic';
$wgHooks['RawPageViewBeforeOutput'][] = 'fnFram_Strip';

Setup function

When the wiki parser creates download links, file and filelink are treated identically, while fileanchor is simply rendered as nothing.

function efFram_Setup() {
    global $wgParser;
    $wgParser->setFunctionHook( 'file', 'efFram_Render' );
    $wgParser->setFunctionHook( 'filelink', 'efFram_Render' );
    $wgParser->setFunctionHook( 'fileanchor', 'efFram_Empty' );
}
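
To illustrate this mapping outside MediaWiki, the sketch below uses a plain PHP array as a stand-in for the parser's function table. Everything prefixed with framSketch is hypothetical and is not part of MediaWiki or Fram, and the URL is a placeholder; the point is only that two names share one renderer while the third renders as an empty string.

```php
<?php
// Stand-in for the real renderer: returns a placeholder download URL.
function framSketchRender(string $filename): string
{
    return 'http://wiki.example.org/index.php?title=Fram&action=raw&file='
        . urlencode($filename);
}

// Stand-in for the anchor handler: anchors vanish from the rendered page.
function framSketchEmpty(string $filename): string
{
    return '';
}

// Hypothetical function table; MediaWiki keeps its own inside the parser.
function framSketchHookTable(): array
{
    return [
        'file'       => 'framSketchRender',
        'filelink'   => 'framSketchRender',
        'fileanchor' => 'framSketchEmpty',
    ];
}

// Look up a parser function by name (case-insensitively) and call it.
function framSketchCall(string $name, string $filename): string
{
    $hooks = framSketchHookTable();
    $callback = $hooks[strtolower($name)];
    return $callback($filename);
}

echo framSketchCall('file', 'hello.sh'), "\n";
echo framSketchCall('fileLink', 'hello.sh'), "\n";
var_dump(framSketchCall('fileanchor', 'hello.sh'));
```
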

Hook to initialize the magic words

We add the magic words here. The first array element indicates whether the word is case sensitive; here it is 0, so the words are not case sensitive. We could add extra elements to create synonyms for our parser functions.
Unless we return true, other parser function extensions will not get loaded.

function efFram_Magic( &$magicWords, $langCode ) {
    $magicWords['file'] = array( 0, 'file' );
    $magicWords['filelink'] = array( 0, 'filelink' );
    $magicWords['fileanchor'] = array( 0, 'fileanchor' );
    return true;
}
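
The array shape can be tried out on its own. The following sketch only mimics how such an entry might be read; MediaWiki does the real matching internally. The helper framSketchMatchesMagic and the synonym 'download' are invented for illustration.

```php
<?php
// Hypothetical helper mimicking how a magic word entry is read:
// element 0 is the case-sensitivity flag, the rest are the word and synonyms.
function framSketchMatchesMagic(array $entry, string $candidate): bool
{
    $caseSensitive = (bool) $entry[0];
    $synonyms = array_slice($entry, 1);
    foreach ($synonyms as $synonym) {
        $equal = $caseSensitive
            ? $synonym === $candidate
            : strcasecmp($synonym, $candidate) === 0;
        if ($equal) {
            return true;
        }
    }
    return false;
}

// 'download' is an invented synonym, used here for illustration only.
$entry = array(0, 'file', 'download');
var_dump(framSketchMatchesMagic($entry, 'FILE'));       // differs only in case
var_dump(framSketchMatchesMagic($entry, 'download'));   // the synonym
var_dump(framSketchMatchesMagic($entry, 'fileanchor')); // a different word
```
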

Parser functions of the magic words

This is the transformation rule that replaces link shortcuts with actual download links.
The input parameters are wikitext with templates expanded, and the output should be wikitext too.
TODO: what error should we send out if no filename is given?
TODO: support links to files located in other local wiki pages, something like a second argument defaulting to $pagename='Fram'
EDIT: It seems that commit 27667 (1.11 -> 1.12) changed the default parser, which breaks the recursive parsing. Thanks to Tim Starling for helping me to get around the problem!

function efFram_Render( &$parser, $filename = '') {
    return $parser->mTitle->getFullURL( 'action=raw&file='.urlencode( $filename ) );
}
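
As a rough illustration of the kind of link this produces, here is a standalone sketch that assembles a similar URL by hand. It assumes a wiki reachable through index.php with a title query parameter, which is a common but not universal setup; the function framSketchDownloadUrl and the host name are hypothetical, and the real extension leaves URL building to MediaWiki.

```php
<?php
// Hypothetical helper: builds a raw-download URL without MediaWiki.
// $scriptUrl is assumed to point at the wiki's index.php.
function framSketchDownloadUrl(string $scriptUrl, string $pageTitle, string $filename): string
{
    return $scriptUrl
        . '?title=' . rawurlencode($pageTitle)
        . '&action=raw&file=' . urlencode($filename);
}

echo framSketchDownloadUrl('http://wiki.example.org/index.php', 'Fram', 'hello world.sh'), "\n";
```

The filename goes through urlencode() just as in efFram_Render, so spaces and other reserved characters cannot break the query string.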

And the other one, which just removes the anchors from the rendered wiki page.
Curiously enough, if the function doesn't exist at all, the effect is exactly the same: MW doesn't throw any error.
But let's keep things clean...

function efFram_Empty( &$parser, $filename = '') {
    return '';
}

Hook to intercept the raw output

This part of the code doesn't look that nice because we have to parse the raw wiki page ourselves to retrieve the code sections we want.

First let's see whether ?action=raw was used in the context of this extension. In that case we receive the filename as a GET parameter. Otherwise we simply return true from our extension, which means we authorize the raw display (the hook was originally created to add an authentication point).

function fnFram_Strip(&$rawPage, &$text) {
    if (!isset($_GET['file']))
        return true;
    $filename=$_GET['file'];

By default the downloadable file is still handled by the ob_gzhandler output buffering started by MediaWiki. To avoid output buffering and gzipping, one can uncomment the following line:

    // Uncomment the following line to avoid output buffering and gzipping:
    // wfResetOutputBuffers();

The raw action has already set the headers with some client cache pragmas, and its output is meant to be displayed in the browser. In our case we want to make this "page" a downloadable file, so we overwrite the headers that were defined and add a few more. These ensure there is no caching on the client (it's very hard for the client to force a refresh on a file download, unlike a web page) and provide the proper filename.


    header("Content-disposition: attachment;filename={$filename}");
    header("Content-type: application/octet-stream");
    header("Content-Transfer-Encoding: binary"); 
    header("Expires: 0");
    header("Pragma: no-cache"); 
    header("Cache-Control: no-store");
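
The header set can also be looked at in isolation. The sketch below returns the lines as strings instead of sending them, so it runs without a web server. The helper framSketchDownloadHeaders is hypothetical, and quoting and sanitizing the filename is an extra precaution assumed here, not something the extension code above does.

```php
<?php
// Hypothetical helper: builds the header lines as strings instead of sending them.
function framSketchDownloadHeaders(string $filename): array
{
    // Assumption: drop control characters, double quotes and backslashes so
    // the name cannot break out of the quoted filename parameter.
    $safeName = preg_replace('/[\x00-\x1F\x7F"\\\\]/', '', $filename);

    return [
        'Content-Disposition: attachment; filename="' . $safeName . '"',
        'Content-Type: application/octet-stream',
        'Content-Transfer-Encoding: binary',
        'Expires: 0',
        'Pragma: no-cache',
        'Cache-Control: no-store',
    ];
}

foreach (framSketchDownloadHeaders('hello.sh') as $line) {
    echo $line, "\n";
}
```
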

Then we strip the output. First we have to locate the anchors, but some anchors could be protected inside literal blocks such as nowiki.
So we mask the literal blocks before searching for the anchors. We mask with a string of the same length because the offsets we retrieve will be applied to the original string, so they must match.
TODO: should we also take care of source, js, css, pre,... blocks?

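    // The 's' modifier lets '.' match newlines, so multi-line nowiki blocks are masked too
    $maskedtext=preg_replace_callback('/<nowiki>(.*?)<\/nowiki>/s',
        function ($matches) {
            // Same-length mask, so offsets still apply to the original text
            return str_repeat('X', strlen($matches[0]));
        },
        $text);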
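
To see why the mask must keep the length unchanged, here is a small standalone sketch of the same idea. The function name framSketchMaskNowiki and the sample text are made up for the illustration.

```php
<?php
// Replace every <nowiki>...</nowiki> block with X characters of equal length.
function framSketchMaskNowiki(string $text): string
{
    return preg_replace_callback(
        '/<nowiki>.*?<\/nowiki>/s',
        function (array $matches): string {
            return str_repeat('X', strlen($matches[0]));
        },
        $text
    );
}

$text   = "<nowiki>{{#file:a.txt}}</nowiki> then {{#file:a.txt}}";
$masked = framSketchMaskNowiki($text);

// The protected anchor is hidden, and the offset of the real one is the
// same in both strings, so it can be used on the original text.
$offset = strpos($masked, '{{#file:a.txt}}');
echo $masked, "\n";
echo substr($text, $offset), "\n";
```
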

Now we can search for the anchors (or for the short version, in which case we keep only the first hit; there is no support for multiple blocks).
Then we free the memory used by the masked version.
TODO: instead of cowardly returning if we don't find our anchors, we should cancel the headers and return a proper error page

    // Escape regex metacharacters in the filename ('.', '/', ...) before using it in a pattern
    $quoted=preg_quote($filename, '/');
    if (preg_match_all('/{{#fileanchor: *'.$quoted.' *}}/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE))
        $offsets=$matches[0];
    else if (preg_match_all('/{{#file: *'.$quoted.' *}}/i', $maskedtext, $matches, PREG_OFFSET_CAPTURE))
        $offsets=array($matches[0][0]);
    else
        // We didn't find our anchor, let's output all the raw...
        return true;
    unset($maskedtext);
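
The offset capture can be tried on its own. This sketch collects the byte offsets of all anchors for one filename; the helper framSketchFindAnchors and the sample text are hypothetical. It passes the filename through preg_quote() so that a name like a.txt does not also match aXtxt.

```php
<?php
// Return the byte offsets of every {{#fileanchor: NAME}} in $text.
function framSketchFindAnchors(string $text, string $filename): array
{
    // preg_quote() escapes regex metacharacters in the filename, including "/".
    $pattern = '/\{\{#fileanchor: *' . preg_quote($filename, '/') . ' *\}\}/i';
    $offsets = [];
    if (preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE)) {
        foreach ($matches[0] as $match) {
            $offsets[] = $match[1]; // $match is array(matched text, offset)
        }
    }
    return $offsets;
}

$text = "{{#fileanchor: a.txt}}<pre>1</pre>{{#fileanchor:aXtxt}}{{#FileAnchor:a.txt }}";
print_r(framSketchFindAnchors($text, 'a.txt'));
```
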

$text is both input and output, so we copy it and start with an empty output.

    $textorig=$text;
    $text='';

For each anchor found we have to isolate the content of the next block.

    foreach ($offsets as $offset) {

Let's remove the text up to the tag following the anchor
TODO: the next tag could be a < br >, which we should skip

        $out = substr($textorig, $offset[1]);
        $out = substr($out, strpos($out, '<'));

What type of tag do we have?
Note that we're looking at the word directly following '<', up to '>' or a space (a space appears when the tag has arguments).
TODO: once again, better handling of errors than just returning.

        if (!preg_match('/^<([^> ]+)/', $out, $matches))
            return true;
        $key = $matches[1];

OK, let's extract the text up to the closing tag.
We skip the first newline (line feed, character code 10) after the opening tag, if any.
We look for the closing tag and take what's in between.
TODO: once again, better handling of errors than just returning.

        $begin = strpos($out, '>')+1;
        if (ord(substr($out,$begin,1))==10)
            $begin++;
        if (preg_match_all('/<\/'.$key.'>/', $out, $matches, PREG_OFFSET_CAPTURE))
            $text .= substr($out, $begin, $matches[0][0][1]-$begin);
        else
            // error, we could not find end of block
            $text .= substr($out, $begin);
    }


No need to deal with a Content-Length header because MediaWiki will do it for us, and more properly than we could when the output is sent gzipped, which is the default.
So that's it, $text contains our file!

    return true;
}
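
The per-anchor extraction steps can be condensed into one standalone function for experimenting. This is a sketch rather than the code Fram runs: framSketchExtractBlock is a hypothetical name, it uses strpos() instead of preg_match_all() to find the closing tag, and it returns null where the extension simply returns.

```php
<?php
// Extract the content of the first tag that follows byte offset $anchorOffset.
// Returns null when no usable opening tag follows the anchor.
function framSketchExtractBlock(string $text, int $anchorOffset): ?string
{
    $out = substr($text, $anchorOffset);
    $tagStart = strpos($out, '<');
    if ($tagStart === false) {
        return null; // no tag after the anchor
    }
    $out = substr($out, $tagStart);

    // Tag name: the word after '<', up to '>' or a space.
    if (!preg_match('/^<([^> ]+)/', $out, $matches)) {
        return null;
    }
    $tag = $matches[1];

    $close = strpos($out, '>');
    if ($close === false) {
        return null; // opening tag is never closed
    }
    $begin = $close + 1;
    // Skip the first newline after the opening tag, if any.
    if (substr($out, $begin, 1) === "\n") {
        $begin++;
    }

    $end = strpos($out, '</' . $tag . '>', $begin);
    if ($end === false) {
        return substr($out, $begin); // no closing tag: take the rest
    }
    return substr($out, $begin, $end - $begin);
}

$page = "Intro {{#fileanchor: hello.sh}}\n<pre lang=\"bash\">\necho hi\n</pre>\nMore text";
echo framSketchExtractBlock($page, strpos($page, '{{#fileanchor'));
```
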

Credits

There is an official way to register the extension in a Mediawiki installation, so that it will be visible on the Special:Version page.
Let's say the extension is in the category of parser hooks even if there is also a hook on Raw action.

$wgExtensionCredits['parserhook'][] = array('name' => 'Fram',
                           'version' => '0.2',
                           'author' => 'Philippe Teuwen and Haakon Meland Eriksen',
                           'url' => 'http://www.mediawiki.org/wiki/Extension:Fram',
//                         'url' => 'http://www.far.no/fram/index.php?title=Fram',
                           'description' => 'Downloads a RAW copy of <nowiki><tag>data</tag></nowiki> in a file.<br>'.
                                            'Useful e.g. to download a script or a patch.<br>'.
                                            'It also allows what is called '.
                                            '[http://en.wikipedia.org/wiki/Literate_programming Literate Programming]. '.
                                            'Fram will support downloading of all project files from different article'.
                                            ' pages in a single archive file, e.g. a tar.gz or .zip file.');
}
 
?>

And finally, registration of the extension at the MediaWiki website according to the Extensions Manual.

TODO: This extension does not have its own page on the official MediaWiki site yet.

Installation

Download Fram.php and save it under the MediaWiki directory as extensions/Fram/Fram.php

Add at the end of LocalSettings.php:

require_once("$IP/extensions/Fram/Fram.php");

Status

If you use the extension properly, the code is fully functional, but its error handling is rather rough.

ChangeLog

0.2

  • Fix problem with Content-Length mismatch when transport is gzipped (default for Mediawiki if client supports it)

0.1

  • Initial version

Questions and feedback

If you have any trouble, questions or suggestions, you can contact Phil or Haakon.
