I can think of two possible things you could do, depending on what you want.  But first, I would like a little clarification on what you mean by "over time, the threshold becomes too large."  Is this a fixed historical threshold, and your data points are showing a trend away from the historic mean, which takes them out of center of the gaussian distribution?  Or are you updating your acceptance threshold with the new data, and the threshold itself is growing or shrinking?Here are the two possibilities I can see:
1)  (Less informative, but easier)  In theory, pure Gaussian white noise should have an absolutely flat spectrum (constant magnitude, phase uniformly distributed on (-pi,pi)).  Since your data has periodicities, there will be spikes (or maybe broader peaks, depending on the quality of the data) in the spectrum at the frequencies of these periodicities.  If there is a long-term trend that is not periodic, it will show up as a spike at frequency zero.  If you simply process the spectrum to remove these spikes, what you will have left is just the noise, and in this case, the noise is what you want, because it contains the information about the amplitude of the random variation.
How should you kill the spikes?  I think it doesn't really matter very much; basically you just want to remove the trends and periodicities from the data.  You could simply truncate the values in the spectrum to the height of the noise far from the spikes.  This might not remove all the excessive spectral density near the spikes, however, it is just the easiest.  A better way would be to normalize the spectrum by computing a running local density (say, for example, the r.m.s. of the absolute value of 5 adjacent frequencies) and then divide the spectrum by this local density.  You might use more or less than 5 adjacent frequencies, depending on what your data looks like.  You need to have the local average run over a set of frequencies narrower than the spikes, so that the spikes will be fully suppressed, but at the same time, the average should be across as wide a set of frequencies as possible, to get a smoother average density function.
Now invert the flattened spectrum to get a processed time series containing just noise with no cycles or trends.  From this noise, you can get your gaussian pdf as usual, and use that (again, just as usual) to get a threshold by which you can assess your new data points.  By the way, the new data themselves should have already been included in the set of data that you took the FFT of; that way, these data have already been processed to remove trends and cycles.  In other words, you just look at the last few data points in your processed time series and compare them to your newly computed threshold.
You can also get at the cycles and trends by subtracting your processed time series from the unprocessed one.  The result should be a fairly smooth set of points showing the cycles and trends.  Alternatively, you could subtract the processed FFT from the original FFT, and the result will be just the spikes, whose frequency, amplitude, and phase will let you identify each cycle and trend.
2) (More work to set up, but nicely automates the process) Use the technique called Kalman filtering.  This is something I just learned about last spring when a student asked me to help her to understand it.  I am not by any stretch an expert on this, just a beginner!  However, what Kalman filtering does is to fit data to a function by giving an estimate of the parameters and the residuals.  What is nice about the Kalman filter is that it does this in an incremental way, as opposed to least squares fitting, which has to process a whole batch of data at once.  
To make this work, you need a fitting function.  You can take something very general, like At + B + Csin(Dt + E) +Fsin(Ft + G), to accomodate a linear trend and a couple of periodicities.  It is not necessary to know the values A through G, as the Kalman filter finds the best estimates of them for you as it proceeds.  You do have to guess some initial values, the closer to the truth the better, of course.  Now subtract the fitted function from you actual time series data, and the result should be a nice uniform white Gaussian noise, the last few point of which will be the noise value of your newest data.  As before, use this to set up you threshold, and then assess you new data.
This method sounds shorter than method 1, only because I have not described the Kalman filter in detail.  I'll admit right now that I found the Kalman filter challenging to understand at a first reading.  You really need to think hard about what the filter is doing at each stage of the calculation in order to understand it well enough to implement it and use it.  My student had to implement it in Matlab, and we eventually did get a working Kalman filter going, but it is a rather tricky process to set up.  The basic operation of the algorithm is to take the data points one at a time, and use them to update existing estimates of the parameters (A through G in the example fitting function).  After each data point is read, several error estimates and predictors are computed, parameters are updated, and some further error estimates are set up ready for the next data point.
The advantage of this as opposed to simple least squares fitting or the FFT method, is that you never have to reprocess your data.  Suppose for example, in the FFT method, you had 1000 data points, and you want to add a few more, say another 10.  Then you have to FFT the whole set of 1010 data points, renormalize your spectrum, reinvert, compute your threshold, and finally assess your new data points.  With the Kalman filter, you just save the final parameters and error estimate matrices from your previous 1000 data points, and run the Kalman filter on your next 10 data points starting from where you left off.  As you see, this is much, much less processing, since you don't have to do anything over again, only process the new data.  I was very impressed with the cleverness of the method when I first learned it, as you can probably tell.
I'm sorry that I can't give you any better reference than just the name "Kalman filter," but I learned about it from a course handout my student got from her professor.  It's probably documented on the Web somewhere, since the method is about 30 years old and apparently well known in the world of time series data processing.  It was only new to me.
I hope this is some help to you!  Let me know if you have any questions about all this.
--Stuart Anderson