Small C Projects (stackoverflow.com)
116 points by mvid on March 27, 2011 | 28 comments


It really surprises me when people look for "small" programming projects.

Boot a Unix, leave X, sit still for 20 minutes. There.

Get the sources to your Unix and just go read the code for your everyday utils. This is best done with a BSD, that isn't bloated with GNUisms.

If you don't believe me about GNU bloat, just look here:

http://www.freebsd.org/cgi/cvsweb.cgi/src/bin/

Get a BSD and devour the beauty that is Unix, unfuckedwith.

FreeBSD also comes with all the papers & research docs you need; love how the troff formatting is readable in console with zcat. The CRT radiation kept me glowing green for many an enjoyable night.


If you want minimalism, you could study the BusyBox sources as well. E.g. BusyBox coreutils:

http://git.busybox.net/busybox/tree/coreutils

Most BusyBox utilities leave out more options than FreeBSD equivalents.


I almost always prefer the GNU utils to the BSD ones. The GNU ones usually allow arguments in nicer orders. GNU `sort -h` can sort the human-sized output of GNU `du -h`. I've noticed tons of niceties from these sorts of tools on Ubuntu are completely missing from BSD based OSX.

One man's bloat is another man's convenience.


The GNU toolset may have combobulated source, but it's a damn sight more usable and useful than bare-bones utilities.


But the metric of this thread was "small C projects", not useful, cushy coreutils.


> bloat

This word, which I see so often, is almost entirely content-free. At most, it conveys a vague sense of 'big is bad'; the association of 'bigness' is the only thing that saves it from being a purely content-free snarl word.

http://rationalwiki.org/wiki/Loaded_language#Snarl_words

("When used as snarl words, these words are essentially meaningless; most of them can be used with meaning, but rarely are.")

To say what ought to be obvious, one person's bloat is another person's essential feature. And, yes, I do use that paragon of GNU software, GNU Emacs, and surely you realize the folly of trying to tell me my editor of choice is bloated.


Bloat is real, and it affects one in five American teenagers, and 100% of GNU code. Bloat can be defined as `cat` being 7 pages in BSD, and 20 in GNU. It's an aesthetic value judgment, made by someone qualified to say .. "do not want":

http://www.freebsd.org/cgi/cvsweb.cgi/~checkout~/src/bin/cat...

http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/cat....


Plan 9's cat is written in a mere 35 lines: http://plan9.bell-labs.com/sources/plan9/sys/src/cmd/cat.c

As always, if you want to read C code as written by the same people who invented C, the Plan 9 source code (http://plan9.bell-labs.com/sources/plan9/sys/src/) is a great resource.


Then what? Spend an eternity in heart-break hotel with LispM & BeOS fans? ;-)


Nah, come join the fun w/ the Plan 9 crew at http://golang.org!


314 vs. 784 lines, not adjusted for amount of comments. Bloat or simply an increased set of features? Being used to GNU commands, I've always found BSD commands lacking features I often use. Although you can usually compose the same operation using some additional pipes and although I understand how this would be considered 'prettier', 'cleaner' or 'better', I still feel the GNU commands I am used to are just right and do not constitute 'bloat'.


In many cases the "GNU bloat" lies not in the added features but in unnecessary cleverness and complexity. See how GNU cat does its own complex buffering, while the FreeBSD version just calls read/write in a loop in the simple case and uses stdio in the complex one.
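To make the "read/write in a loop" idea concrete, here is a rough sketch of what such a copy loop can look like. This is not the FreeBSD source; `copy_fd` is a name made up for this example, and the 4096-byte buffer is an arbitrary choice.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd using plain read(2)/write(2).
 * Returns 0 on success, -1 on error (errno is left set by the failing call).
 * Hypothetical helper for illustration, not taken from any real cat. */
int copy_fd(int in_fd, int out_fd)
{
    char buf[4096];   /* arbitrary buffer size for this sketch */
    ssize_t nread;

    assert(in_fd >= 0 && out_fd >= 0);

    while ((nread = read(in_fd, buf, sizeof buf)) != 0) {
        if (nread < 0) {
            if (errno == EINTR)
                continue;        /* interrupted before any data arrived: retry */
            return -1;
        }
        /* write(2) may accept fewer bytes than requested, so keep going
         * until the whole chunk has been written. */
        ssize_t off = 0;
        while (off < nread) {
            ssize_t nwritten = write(out_fd, buf + off, (size_t)(nread - off));
            if (nwritten < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += nwritten;
        }
    }
    return 0;   /* read returned 0: end of input */
}
```

A bare-bones cat would then call `copy_fd(STDIN_FILENO, STDOUT_FILENO)` when given no arguments, or open each file argument and copy it in turn. Everything beyond that (options, portability shims, buffer tuning) is where the line counts start to diverge.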


From a quick skim of the two sources, the difference in sizes can be attributed to:

1) The GNU version has MUCH more verbose commenting.

2) The GNU version reads input in blocks in cooked mode, rather than a character at a time as the BSD cat does. This is MUCH faster, but it leads to a lot more complexity, and thus more code. It also carefully calculates the optimal block size to use for this; this is also quite complex and carefully commented (see, e.g., the 20 line comment at line 736).

3) The GNU version has to be portable to multiple unixes, and thus has a number of places where it has to test for multiple error codes, and/or missing features.

4) The GNU version has a very verbose --help output, which consumes a good page or so of code by itself.

I don't really see any 'bloat' there. Sure, it's okay to have a simple cat, but it's not a bad idea to optimize a tool that's used so frequently. And the verbose --help output, verbose commenting, and portability are all part of the GNU coding standards. You can argue about whether you want to spend all that effort on it, but I don't think the sheer volume of code is a good measure for whether the code is good or not.


Seven? That is at least around 85% bloat. http://minnie.tuhs.org/cgi-bin/utree.pl?file=3BSD/usr/src/cm... needed only one page.

As the poster above said, one man's bloat is another man's essential feature.

Back to the subject: reading those old sources really teaches you why, back in the seventies, people found Unix so appealing. Even ignoring the feature growth/creep (or whatever you want to call it), you do not have to wade through a zillion copyright header lines, option parsing that goes on for ages, locale-specific stuff, etc., before getting to the meat of the program. The disadvantage is that some code dives into assembler fairly quickly (for example, printf is mostly assembly in the system I refer to above).


Early C source code kept its figure by forgoing any stupid bounds checking.


The immediate difference I noticed is that the GNU version contains a lot more comments describing what actually happens.

Now, what do you think is more appropriate for educational purposes: source code with plenty of comments, or source code with nearly no comments at all?


If you just wanna see how a few library functions and system calls are used? Yeah, just get APUE and dig into the simplest implementations you can find, which are often BSDish.

I routinely gloss over comments when reading code anyway; the most accurate documentation is found via reflection & introspection on the system itself, not comments.


It's a little different from what the OP was asking for, but my favorite toy programming project: write a shell. This brings you in contact with many of the basic aspects of the Unix system: pipes, fork, exec, environment variables. In around a thousand lines, you can implement a shell with pipes, backgrounding, backquote, variable expansion, conditionals, looping, and lots and lots of bugs.
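As a taste of the core of such a shell, here is a minimal sketch of running `left | right` with pipe, fork and exec. The names `spawn` and `run_pipeline` are made up for this example, there is no parsing at all, and error handling is deliberately thin; a real shell needs a lot more.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that runs argv[0] with stdin/stdout redirected to in_fd/out_fd
 * (pass -1 to leave one untouched). unused_fd is the pipe end this child must
 * not keep open, otherwise the reader never sees end-of-file.
 * Returns the child's pid, or -1 if fork failed. Hypothetical helper. */
static pid_t spawn(char *const argv[], int in_fd, int out_fd, int unused_fd)
{
    pid_t pid = fork();
    if (pid == 0) {
        if (unused_fd >= 0)
            close(unused_fd);
        if (in_fd >= 0 && in_fd != STDIN_FILENO) {
            dup2(in_fd, STDIN_FILENO);
            close(in_fd);
        }
        if (out_fd >= 0 && out_fd != STDOUT_FILENO) {
            dup2(out_fd, STDOUT_FILENO);
            close(out_fd);
        }
        execvp(argv[0], argv);
        _exit(127);              /* only reached if exec failed */
    }
    return pid;
}

/* Run "left | right" and return the exit status of right,
 * or -1 if something went wrong along the way. */
int run_pipeline(char *const left[], char *const right[])
{
    int fds[2];
    int status = 0;
    int result = -1;

    assert(left && left[0] && right && right[0]);
    if (pipe(fds) < 0)
        return -1;

    pid_t lpid = spawn(left, -1, fds[1], fds[0]);   /* writes into the pipe */
    pid_t rpid = spawn(right, fds[0], -1, fds[1]);  /* reads from the pipe */

    /* The parent must close both ends too, or the reader hangs. */
    close(fds[0]);
    close(fds[1]);

    if (lpid > 0)
        waitpid(lpid, &status, 0);
    if (rpid > 0 && waitpid(rpid, &status, 0) == rpid && WIFEXITED(status))
        result = WEXITSTATUS(status);
    return result;
}
```

Forgetting one of those `close` calls is probably the classic first bug in a toy shell: the pipeline just sits there because some process still holds the write end open.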

In general, I like the idea of trying to implement absolutely minimal versions of common programs. Other possibilities are: an HTTP server or proxy, a Lisp interpreter.


To make a Lisp interpreter you need to know Lisp. But I liked the idea of writing a shell.


Writing a Lisp interpreter is actually a good way to learn most of the core Lisp concepts.


Buy an embedded system like the Arduino and learn C while hacking a bit of hardware too. Yes, the Arduino is not 100% ANSI C but I suspect it's close enough.

For me learning a new programming language is always about finding a problem to solve, something that will keep me interested and make it fun to learn.


One of the good suggestions was exercises from K&R. I also like "Practical C" by Steve Oualline.

Then the best thing is to just work on something you like.

If you like networking, write an RPC server using the basic Linux socket interface or use 0mq to write a pub/sub system:

http://www.zeromq.org/intro:read-the-manual

If you like audio or DSP, write an audio processing program that adds an echo or other effect to an input audio stream. You can try the PortAudio interface:

http://www.portaudio.com/trac/browser/portaudio/trunk/test/p...
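The echo itself is only a few lines once you have samples in a buffer. A minimal sketch, independent of PortAudio, assuming mono float samples; `add_echo` and its parameters are names made up for this example, and clipping is not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Add a feedback echo in place. Each sample gets a decayed copy of the
 * sample `delay` positions earlier mixed in. Because the buffer is processed
 * left to right, that earlier sample already contains its own echo, so the
 * echoes repeat and fade out over time. Hypothetical helper. */
void add_echo(float *samples, size_t count, size_t delay, float decay)
{
    assert(samples != NULL && delay > 0);

    for (size_t i = delay; i < count; i++)
        samples[i] += decay * samples[i - delay];
}
```

Feeding it a single impulse with `delay` 2 and `decay` 0.5 leaves a 0.5 echo two samples later and a 0.25 echo two samples after that. In a streaming program you would keep the last `delay` samples around between callbacks instead of working on one big buffer.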

If you like file systems & Linux, write a FUSE file system. Or a kernel module:

http://lwn.net/Articles/68106/

If you like graphics you can try libSDL:

http://friedspace.com/SDLTest.c


An XML parser makes a nice medium-sized project; it'll get you experience with C data structures and string handling. Plus it has the nice property that you can start small and build on, until you've got the whole spec done, and every bit along the way is useful.


These questions always seem so ridiculous. There are thousands of examples of projects that fall into systems programming.

Do one of these things: keyboard/mouse driver, hard drive diagnostics poller, implement raw sockets on Windows, system call hooking pattern, a VM for a scripting language, a VM/crypter software protection scheme, etc.

There's all kinds of stuff.


> These questions always seem so ridiculous

Maybe I am misreading this. But I don't quite see why. The poster seemed to want to relearn C and was asking other people's opinions on what would be interesting small projects to start out with.


There is no reason to ask such a question because potential projects are outlined in _thousands_ of places across the net. It seems that he put absolutely no effort into googling and finding such projects on his own.


Write a Brainfuck interpreter. Seriously, it is trivial to write a simple one, slightly harder to write a compact one, and then you can just hack on it to optimize for speed. There are great test cases too, including an ASCII Mandelbrot renderer :)
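For reference, the "simple one" can look roughly like this. It is a naive sketch: `bf_run` and `BF_TAPE_SIZE` are names made up here, the 30000-cell tape and "EOF reads as 0" are common conventions rather than anything mandated, and brackets are matched by rescanning each time, which is exactly the part you would optimize later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BF_TAPE_SIZE 30000   /* conventional tape length, an assumption */

/* Run a Brainfuck program, reading from `in` and writing to `out`.
 * Returns 0 on success, -1 if the data pointer leaves the tape or a
 * bracket jump cannot find its partner. Hypothetical helper. */
int bf_run(const char *prog, FILE *in, FILE *out)
{
    unsigned char tape[BF_TAPE_SIZE] = {0};
    size_t ptr = 0;
    size_t len;

    assert(prog != NULL);
    len = strlen(prog);

    for (size_t pc = 0; pc < len; pc++) {
        switch (prog[pc]) {
        case '>': if (++ptr >= BF_TAPE_SIZE) return -1; break;
        case '<': if (ptr == 0) return -1; ptr--; break;
        case '+': tape[ptr]++; break;
        case '-': tape[ptr]--; break;
        case '.': fputc(tape[ptr], out); break;
        case ',': {
            int c = fgetc(in);
            tape[ptr] = (c == EOF) ? 0 : (unsigned char)c;
            break;
        }
        case '[':
            if (tape[ptr] == 0) {          /* skip forward to the matching ']' */
                int depth = 1;
                while (depth > 0) {
                    if (++pc >= len) return -1;
                    if (prog[pc] == '[') depth++;
                    else if (prog[pc] == ']') depth--;
                }
            }
            break;
        case ']':
            if (tape[ptr] != 0) {          /* jump back to the matching '[' */
                int depth = 1;
                while (depth > 0) {
                    if (pc == 0) return -1;
                    pc--;
                    if (prog[pc] == ']') depth++;
                    else if (prog[pc] == '[') depth--;
                }
            }
            break;
        default:
            break;                         /* any other character is a comment */
        }
    }
    return 0;
}
```

Calling `bf_run("++++++++[>++++++++<-]>+.", stdin, stdout)` prints `A` (8 times 8 plus 1 is 65). Precomputing a jump table for the brackets is the obvious first speedup.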


For the 'more distinct parts of the C language', I would suggest reading a couple of the obfuscated C contest entries, then writing one of your own.



