Akom's Tech Ruminations

Various tech outbursts - code and solutions to practical problems

Simple flat file site search in PHP/Smarty

Posted by Admin • Wednesday, January 19. 2011 • Category: Linux

Sometimes using a real search implementation (Lucene, Sphinx) is just too much. The particular site I was working on has something like 30 pages and is maintained as flat files (Smarty templates, but it's basically HTML on disk), and it really, really should not require megabytes of code and cron jobs to be able to search it!

That said, this is a simple search solution - it makes a lot of assumptions:


  1. Site is hosted on a Linux/Unix OS
  2. Files that correspond to pages are easily discerned from files that should not be searched
  3. You can exec a system command from PHP
  4. You can infer the name of the page to present in search results from the name of the file (or you don't care)

The approach:

  1. Use grep
  2. No, seriously, use grep. Sound scary? Not if it's only a few dozen pages
  3. Combine grep -l calls to create "AND"-ed searches for all the keywords

So for a query like 'tall tree', the resulting command would be something like:

find templates -name 'pageprefix_*.tpl' | xargs grep -il 'tall' | xargs grep -il 'tree'

The output of this command would simply be the list of files that contain both 'tall' and 'tree' (try it on the command line first). The function returns the clean bare page names (this is appropriate in my environment) so that the search page can present them the way it wishes.
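If you want to see the AND behaviour before touching PHP, here is a rough sketch you can paste into a shell. The three page files and their contents are made up for the demo; only the pipeline itself comes from the command above. The last part shows one possible way to cut a matching path down to a bare page name, assuming files are named pageprefix_pagename.tpl.

```shell
#!/bin/sh
# Throwaway site with three hypothetical pages (names and text are invented).
demo_dir=$(mktemp -d)
mkdir "$demo_dir/templates"
printf 'A Tall oak tree\n' > "$demo_dir/templates/pageprefix_oak.tpl"
printf 'A tall building\n' > "$demo_dir/templates/pageprefix_tower.tpl"
printf 'A small tree\n'    > "$demo_dir/templates/pageprefix_bonsai.tpl"

# Each grep -il narrows the file list from the previous stage, so only
# files containing every keyword (case-insensitively) survive.
matches=$(cd "$demo_dir" && find templates -name 'pageprefix_*.tpl' \
    | xargs grep -il 'tall' | xargs grep -il 'tree')

# Reduce each path to a bare page name: drop the directory, the .tpl
# suffix, and everything up to the first underscore (the page prefix).
pages=$(printf '%s\n' "$matches" | while IFS= read -r f; do
    name=$(basename "$f" .tpl)
    printf '%s\n' "${name#*_}"
done)

echo "$matches"   # templates/pageprefix_oak.tpl
echo "$pages"     # oak
rm -rf "$demo_dir"
```

Only the oak page contains both words, so it is the only file that makes it through both grep stages.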


(As my site is PHP + Smarty, my solution is implemented as a Smarty plugin - but you can take the code and do whatever you like)

/*
 * Smarty plugin
 * Type:     function
 * Name:     filesearch - performs a file search among the public pages
 *           using filesystem grep
 */

function smarty_function_filesearch($params, &$smarty) {
    if (checkInvalidValues($params, 'keywords')) {  // this is one of my helpers; do this check your own way
        return "NO GOOD";
    }
    $assign = isset($params['assign']) ? $params['assign'] : null;
    // sanitize incoming keywords for extra hacking safety: each listed character becomes a space
    $cleankeywordparam = str_replace(array('/', '.', '#', '&', '*', ';', '<', '>', '\'', '"', '}', '{', ']', '[', '$', '\\'), ' ', $params['keywords']);
    $results = array();
    try {
        // split into keywords and drop empty ones (e.g. from repeated spaces)
        $keywords = array_filter(explode(' ', $cleankeywordparam), '_trimPages');
        if (count($keywords) > 0) {
            // This is where we hardcode the find expression for your pages, adjust as appropriate:
            $cmd = "find templates -name 'pageprefix_*.tpl' | xargs grep -il ";
            // one grep -il stage per shell-escaped keyword gives the "AND" behaviour
            $cmd .= implode(" | xargs grep -il ", array_map('escapeshellarg', $keywords));
//          echo "Will execute '$cmd'";
            $output = (string) shell_exec($cmd);  // shell_exec returns null when there is no output
            foreach (explode("\n", $output) as $result) {  // the results are just filenames, one per line
                $result = basename(trim($result), '.tpl');
                $temp = explode('_', $result, 2);  // my files are named like pageprefix_pagename.tpl
                if (isset($temp[1])) {
                    $results[] = trim($temp[1]);  // don't want the pageprefix
                }
            }
            $results = array_filter($results, '_trimPages');  // remove empty entries
        }
    } catch (Exception $e) {
        echo "Search error : " . $e;
    }
    if (checkInvalidValues($params, 'assign')) {
        return $results;
    } else {
        $smarty->assign_by_ref($assign, $results);
    }
}

function _trimPages($value) {
    return isset($value) && strlen($value) > 0;
}
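One thing the sanitizing step does not cover: a keyword that starts with a dash (say, -v) still reaches grep and gets read as an option rather than a search term. A possible hardening, sketched here in shell and not part of my plugin, is to pass each keyword after -e and add -F so it is matched as a literal string instead of a regular expression. The file names and contents below are invented for the demo.

```shell
#!/bin/sh
# Two hypothetical pages; only one mentions the literal text "-v".
demo_dir=$(mktemp -d)
printf 'use the -v flag\n' > "$demo_dir/pageprefix_flags.tpl"
printf 'nothing here\n'    > "$demo_dir/pageprefix_other.tpl"

keyword='-v'

# -e marks the next argument as the pattern, even if it begins with a dash.
# -F treats the pattern as a fixed string rather than a regular expression.
hits=$(cd "$demo_dir" && find . -name 'pageprefix_*.tpl' \
    | xargs grep -ilF -e "$keyword")

echo "$hits"   # ./pageprefix_flags.tpl
rm -rf "$demo_dir"
```

In the PHP code this would amount to building each stage as grep -ilF -e followed by the escaped keyword, if you decide you want it.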

The actual Smarty template is:

        <form method="post">
                <input type="text" name="keywords" size="40"
                {if $smarty.post.keywords}value="{$smarty.post.keywords|escape}"{/if} />
                <input type="submit" value="Search"/>
        </form>

        {if $smarty.post.keywords}
                {filesearch keywords=$smarty.post.keywords assign=searchresults}
                <h2>Search Results:</h2>
                <ul>
                {foreach from=$searchresults item=pagename}
                        {* This interpolates the filename back to a human-readable string *}
                        <li><a href="/{$pagename}">{$pagename|replace:'_':' '|replace:'-':' '|ucfirst}</a></li>
                {/foreach}
                </ul>
        {/if}
