Working on YOURLS 1.6 : the next DB structure

What’s up, gents? I’ve begun working on the next iteration and here’s what I’m up to.

YOURLS 1.5 : DB suckage

As everybody knows, the current database design in YOURLS is dumb and very inefficient. Its biggest flaw is that the keyword (ie short url) is repeated in both the URL and the LOG tables, making it absurdly difficult to update a short url without losing all historical data.

Another design decision I regret was to store stuff in case they would be useful to anyone, such as user agents in the log table. Since there’s no core feature using that info and I probably won’t implement any, this should not be there. There’s a plugin API for this.

There are a couple of awesome features I want to work on for YOURLS 1.6, one of them being the ability to store arbitrary data associated to any short URL — think url meta data, the way WordPress does it.

YOURLS 1.6 : previous thoughts

I’ve been pondering about the next DB schema for a very long time now, trying to think about the most state-of-the-art structure I could come up with (given my overall blatant lack of skill for DB things) and addressing all possible future features and current issues. I’ve proposed stuff and a few people have been kind enough to comment. But all this had one weakness : it was complicated and scaring me. Seriously :)

YOURLS 1.6 : smarter yet simple (at least I hope so)

I decided to make up my mind once for good and here is how I see things in YOURLS 1.6:

The main URL table will be properly normalized with a URL id to stop repeating the keyword across tables. I could probably go further in normalization (storing the long URL and title somewhere else, as I first thought) but that’s where I think it becomes too much trouble and hassle for too few benefits.

The URL_META table will store anything you’d like about a particular short URL and will be used by plugins: some tags, a note, a mime type to handle redirection differently, anything.

The LOG table will be trimmed down a bit, which — disk wise — should be beneficial to sites with lots of hits, and properly. Again here I could probably go further and normalize the referrer information, but what bothers me then is the number of DB queries needed for each short URL redirection, which I want to keep at a very minimum.

The LOG_META table, just as its meta sibling, will store anything you’ll want to store about a hit: the user agent, some cookie info, anything.

No big change in the OPTIONS table, just an autoload parameter so plugins will be able to store anything without loading that every time in RAM.

So, that’s it. If you have any thought or any “zomg dude don’t, terrible decision” warning to share, please do. We’ll see later for other DB novelties such as log archives or further optimization. I’d rather stop pondering and start coding :)

Short URL to this post: http://yourls.org/6g

24 thoughts on “Working on YOURLS 1.6 : the next DB structure

  1. Hi, OZH. I was recently watching for the requests for plugins, to see if I could create one or two, and itwas missing your URL_META table; it will be a great improvement.

  2. Cool! As a first time user of Yourls, would you recommend I wait for 1.6, or is the release still a long way off.

    Also, is there a way to autogenerate a string that’s not sequential?

  3. Thanks! I grabbed the nightly build. Found the plug-in, and I’m up-and-running. Much appreciated, it’s working splendidly!

  4. I’m using this for a book project to provide links for resources and further reading. A couple years ago I took a stab (with limited success) at modifying the code to allow >1 url/title per keyword. This allowed me to provide a single link like temp.url/key, which would take the user to a page that listed all the urls/titles associated with that key (e.g. more than one online resource for a given footnote).

    I could do this easier with the url_meta table, but this type of forwarding suggests a distinction between keyword and url entities: The url table would become a keyword table with url and title removed, and a new url table would contain just the url_id (PK), key_id, url, and title. This would make it more flexible but require an extra join or query — and it could also be entirely out of scope because it’s a bit more than just URL shortening. :/

    In any case, url_meta could work fine so long as url_id and url_metatype don’t have to be unique, but it would be a bit of a kludge to get a list of both urls and titles and retain the proper relationship and order between the two.

    Have you (or would you) consider separating keyword and url?

    • url_meta.url_id and url_meta.urlmeta_type pairs won’t be unique, you will be able to store anything

      As you say, I think having one keyword to relate to several urls & title is just out of the scope. Actually I even wonder why you elected to try this with YOURLS in the first place :)

      I didn’t think long and may not totally grasp your needs but I think I’d go for another approach: a page at http://sho.rt/list-blah would display a page of links & title (it’s easy to mimic pages in YOURLS, see http://yourls.org/examplepage that’s generated from the file in the /pages/ directory), and there would be a plugin that, everytime you want to shorten a link with keyword “blah”, adds some HTML (that link and title) list-blah

      This way: easy to add new links (just shorten a URL), easy to maintain link lists (just edit a file if need be), and portable (would work on any YOURLS version)

      • I tried YOURLS mainly because of its awesomeness (reporting, UI, etc), but also because I couldn’t find another solution. Static pages kind of bug me, but I’ll take a look at that pages approach — that’s a clever idea. Thanks for your quick reply.

      • Actually, with a url_id and no unique constraint on keyword, I could just dump multiple links for the same keyword into the url table. If there’s only one record returned for the keyword, then simply redirect, but if there’s >1 then show a list. This could allow easy editing through the UI, too — and no static pages.

        To clarify my need: Say you’re reading a book that makes reference to the Schmeling/Sharkey heavyweight fight in 1930. The footnote describes it briefly, and provides a single short link like sho.rt/ss. The reader types in that link and gets a list of resources: wikipedia entries for both Schmeling and Sharkey, the original newspaper report, and also a video of the fight from archive.org — all in one place.

        • There WILL be a unique constraint on keyword. It’s a key principle of the application and, your very specific need apart, I don’t see a valid reason to remove that constraint but I see a lot of potential problems

          • This seems like it would be a pretty handy feature. I’ve seen a few url shorteners that do just this. I know bit.ly allows you to create groups of links like this, so maybe check with them.

      • My reply didn’t seem to take yesterday, so I’ll try once more:

        That’s too bad. I hope you’ll reconsider at some point. After some quick digging yesterday I was surprised to find that it’s actually pretty darn common nowadays — even bitly now supports “bundles.”

        If anyone else out there needs multi-URL shortening specifically, I’ve “bundled” a list of the first ten or so canned services that I found here: http://bit.ly/RPwXdg — I haven’t found any decent scripts yet, though.

        Ozh, other than my specific need, this kind of feature would also be useful for things like tweeting sets of links, compiling disparate product pages as a “wish list” or BOM, or (apparently) even posting a list of multi-URL shortening services to a blog without getting flagged as spam. :) (Hopefully it’ll go through this time.)

        • Bitly does exactly the same: there are shorturl, there are uniques, AND there is another thing, bundles. The idea of making shorturls non unique to either treat them as shorturls or bundles is completely clumsy.

          A bundle feature in YOURLS is totally possible with a plugin, either using the future urlmeta stuff, or already with the above suggested solution using pages.

          Shorturls WILL be unique.

  5. Hi, I’d store the ip address as ‘int’ rather than ‘varchar’ with the mysql’s built-in inet_aton function.

    • Actually, your comment got me thinking. I don’t need to store an IP in the log table. I just need to store the associated country.

      In the URL table though, I need it. But I think inet_aton doesn’t work well with IPv6?

Comments are closed.