And the last 3 primes under 1 billion are:
999999893 999999929 999999937
I didn't have as much stuff running, and actually had the 200 MB free to write out the list of primes to a file this time. There are over 50 million primes below 1 billion.  

 (Last time I had memory hog programs running and only 150 MB free disk space or so). The total running time for all primes below 1,000,000,000 was about 6 minutes. About 3 minutes to sieve all the numbers, and another 3 minutes waiting for windows to enlarge my page file and to write them all into an array before being stored to disk. I used up about 215 MB of RAM during the operation (and my system only has 192 MB of RAM). Note: The sieving process only took about 15 MB, but the list of prime numbers it generated took up the 200 MB. I used 1 bit per number in the sieve part since I'm only marking if the number is prime or composite. Hence using a bit vector for storage. The list of primes written to the file was a list of 32 ints however. It might have been better if I'd just stored the bit vector.  :unsure: 
Also note, if you've marked all multiples of all primes up to n, then the smallest composite number which has not yet been marked as composite (everything defaults to prime) is greater than n^2. (That is, it has at least 2 factors since it's composite, and each factor is greater than n, since otherwise it would have been marked as composite during a sieve of all multiples of a prime factor less than or equal to n). This is why the outer loop only has to iterate up to the square root of the max value for primes you're looking for. It's also why you can start sieving out composites starting at the square of a new prime that has been found. (Although not required for correctness, these two facts contribute greatly to the speed of the algorithm).
Oh, and to save memory during the sieve and a bit of running time, I special cased 2. That way I only have to check odd multiples of new primes. (Start at p^2, and increment by 2p for the inner loop). Note that if p is odd, then so it p^2, and so p^2 + 2p*n is also odd (since 2p is even). This is important for correctness since I didn't reserve storage space for multiples of 2 (it's clear 2 is the only even prime number) and the indexing doesn't distinguish between even and odd values. (That is, and index of 6 or 7 would both point to the same bit). You wouldn't want to accidentally mark 7 as composite by trying to mark 6 as composite. But this is just an optimization issue. It's only important because of my implementation to save memory (and a bit of time).
The code!
#include <memory.h>
#include <math.h>
#include <iostream.h>
#include <fstream.h>
#define ByteOfBit(i)	(list[(i) / (sizeof(int) * 8 * 2)])
#define Bit(i)  	( 1 << (((i)/2) & (sizeof(int)*8 - 1)) )
#define IsPrime(i)  ((ByteOfBit(i) & Bit(i)) == 0)
unsigned int* GenerateListOfPrimes(unsigned int max, unsigned int &numPrimes)
{
	unsigned int i, j;
	// Initialize the number of primes found
	numPrimes = 0;
	if (max < 2)
  return 0;    // We are done
	// Display status
	cerr << "Allocating " << ((max  / (sizeof(int)*8*2))+1)*32 << " bits (" << ((max  / (sizeof(int)*8*2))+1) << " bytes)" << endl;
	// Allocate space for a list of numbers to sieve
	unsigned int *list = new unsigned int[((max  / (sizeof(int)*8*2))+1)];
	// Check for memory allocation error
	if (list == 0)
	{
  cerr << "Out of memory (creating number list to sieve)" << endl;
  return 0;
	}
	// Display status
	cerr << "Initializing memory" << endl;
	// Initialize the list to all primes
	memset(list, 0, ((max  / (sizeof(int)*8*2))+1)*sizeof(int));
	unsigned int maxSieve = (int)sqrt(max);
	// Display status
	cerr << "Starting sieve for primes under " << max << " (" << maxSieve << " iterations required) ..." << endl;
	numPrimes = 1;    // We've found the prime 2
	// Sieve the list of numbers for primes
	for(i = 3; i < maxSieve; i += 2)
	{
  // Status update
  if ((i & (i-2)) == 1)
  	cerr << "Processed up to: " << i-1 << endl;
  // Check if the current number is prime
  if (IsPrime(i))
  {
  	// New prime found
  	numPrimes++;
  	// Cross off all multiples of the prime
  	for (j = i*i; j < max; j += (i + i))
    ByteOfBit(j) |= Bit(j);	// Mark number as composite
  }
	}
	cerr << "Done sieving. Counting primes in tail of list..." << endl;
	// Finish counting primes
	for (; i < max; i += 2)
  // Check if the current number is prime
  if (IsPrime(i))
  	numPrimes++;  	// New prime found
	cerr << "Allocating space for list of primes" << endl;
	// Allocate space to hold the list of primes
	unsigned int *primeList = new unsigned int[numPrimes];
	// Check for allocation error
	if (primeList == 0)
	{
  cerr << "Error allocating memory to store list of primes (" << numPrimes << " primes found)" << endl;
  delete [] list;    // Free the sieved list of numbers
  return 0;
	}
	// Add 2 to the list of primes
	primeList[0] = 2;
	j = 1;
	// Scan numbers for primes and add to the list
	for (i = 3; i < max; i += 2)
  if (IsPrime(i))    // Check if the current number is prime
  {
  	primeList[j] = i;  // Add the number to the list
  	j++;
  }
	delete [] list;    	// Free the temporary list of numbers that was sieved
	return primeList;    // Return the list of primes
}
int main()
{
	unsigned int numPrimes;
	// Generate the list of primes
	unsigned int *primes = GenerateListOfPrimes(1000, numPrimes);
	// Display how many primes were found
	cerr << endl << "Number of primes found:" << numPrimes << endl;
	// Display the list of primes
	if (numPrimes < 100)
	{
  cout << "The first few primes are:" << endl;
  unsigned int i;
  for (i = 0; i < numPrimes; i++)
  	cout << primes[i] << " ";
	}
	if (numPrimes > 100)
	{
  cout << "The last few primes are:" << endl;
  unsigned int i;
  for (i = numPrimes-100; i < numPrimes; i++)
  	cout << primes[i] << " ";
	}
	
	cout << endl << "Writing primes to file..." << endl;
	// Write the list of primes to a file
	ofstream outFile("Primes");
	outFile.write((char*)primes, numPrimes*sizeof(int));
	outFile.close();
	cout << "Done." << endl;
	
	return 0;
}
Just copy and paste into a C++ file. It mostly looks big because of all my output code to see what the algorithm is doing. I didn't want to get stuck waiting for hours for it to finish since I had now idea how fast it would be.