Tor Metrics

We don't actually count users. Instead, we count the requests that clients periodically make to the directories to update their list of relays, and we estimate user numbers indirectly from there.

No, but we can see what fraction of directories reported them, and then we can extrapolate the total number in the network.
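The extrapolation described above can be sketched as follows. This is only an illustration under a simplifying assumption: the function name and the single `reporting_fraction` parameter are hypothetical, and the real computation may weight directories differently.

```python
def extrapolate_total_requests(reported_requests: float, reporting_fraction: float) -> float:
    """Scale up the requests seen by reporting directories to the whole network.

    reporting_fraction is the (assumed) share of directories that reported
    statistics, expressed as a number in (0, 1].
    """
    if not 0 < reporting_fraction <= 1:
        raise ValueError("reporting_fraction must be in (0, 1]")
    return reported_requests / reporting_fraction


# Example: half of the directories reported 600 requests in total,
# so the network-wide estimate is twice that.
estimated_total = extrapolate_total_requests(600, 0.5)
```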

We assume that the average client makes 10 such requests per day. A tor client that is connected 24/7 makes about 15 requests per day, but not all clients are connected 24/7, so we picked the number 10 for the average client. We simply divide directory requests by 10 and consider the result to be the number of users. Another way of looking at it is that we assume each request represents a client that stays online for one tenth of a day, that is, 2 hours and 24 minutes.
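As a small worked example of this arithmetic, the sketch below divides a daily request count by 10 and derives the "one tenth of a day" figure. The function names and the sample request count are made up for illustration.

```python
REQUESTS_PER_CLIENT_PER_DAY = 10  # the assumed average stated in the text


def estimate_users(directory_requests_per_day: float) -> float:
    """Estimate users by dividing daily directory requests by 10."""
    return directory_requests_per_day / REQUESTS_PER_CLIENT_PER_DAY


def online_time_per_request() -> tuple[int, int]:
    """Return (hours, minutes) that one request stands for: one tenth of a day."""
    minutes = 24 * 60 // REQUESTS_PER_CLIENT_PER_DAY
    return divmod(minutes, 60)


users = estimate_users(2_000_000)      # hypothetical request count
hours, minutes = online_time_per_request()
```

With 2,000,000 requests in a day the estimate comes out at 200,000 users, and each request stands for 2 hours and 24 minutes of online time.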

Average number of concurrent users, estimated from data collected over a day. We can't say how many distinct users there are.

No, the relays that report these statistics aggregate requests by country of origin and over a period of 24 hours. The statistics we would need to gather for the number of users per hour would be too detailed and might put users at risk.

Then we count those users as one. We really count clients, but it's more intuitive for most people to think of users, which is why we say users and not clients.

No, because that user updates their list of relays as often as a user that doesn't change IP address over the day.

The directories resolve IP addresses to country codes and report these numbers in aggregate form. This is one of the reasons why tor ships with a GeoIP database.

Very few bridges report data on transports or IP versions, and by default we assume that requests use the default OR protocol and IPv4. Once more bridges report these data, the numbers will become more accurate.

Relays and bridges report some of the data in 24-hour intervals which may end at any time of the day.
And after such an interval is over, relays and bridges might take another 18 hours to report the data.
We cut off the last two days from the graphs because we don't want the last data point in a graph to suggest a recent trend change that is in fact just an artifact of the algorithm.
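A minimal sketch of that cut-off, assuming daily data points keyed by date. The function name, the data layout, and the exact boundary (dropping today and yesterday) are assumptions for illustration, not the actual graphing code.

```python
from datetime import date, timedelta


def trim_recent_days(points: dict[date, float], today: date, days: int = 2) -> dict[date, float]:
    """Drop the most recent `days` daily data points, counting today as one of them."""
    cutoff = today - timedelta(days=days)
    return {day: value for day, value in points.items() if day <= cutoff}


# Hypothetical daily estimates for five consecutive days.
estimates = {date(2024, 1, d): 1000.0 + d for d in range(1, 6)}
plotted = trim_recent_days(estimates, today=date(2024, 1, 5))
```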

The reason is that we publish user numbers once we're confident enough that they won't change significantly anymore. But it's always possible that a directory reports data a few hours after we reached that confidence, and that this late report slightly changes the graph.

We do have descriptor archives from before that date, but they don't contain all the data we use to estimate user numbers. See the following tarball for more details:

Tarball

For direct users, we include all directories, which we didn't do in the old approach. We also use histories that only contain bytes written to answer directory requests, which is more precise than using general byte histories.

Oh, that's a whole different story. We wrote a 13-page technical report explaining the reasons for retiring the old approach.
tl;dr: in the old approach we measured the wrong thing, and now we measure the right thing.

We run an anomaly-based censorship-detection system that looks at estimated user numbers over a series of days and predicts the user numbers for the following days. If the actual number is higher or lower, this might indicate a possible censorship event or a release of censorship. For more details, see our technical report.
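To illustrate the general idea only, here is a deliberately simplified sketch: it predicts the next value from the mean of recent days and flags observations outside a tolerance band. This is not the model from the technical report; the function name, the mean-plus-standard-deviation band, and the threshold `k` are all assumptions made for this example.

```python
from statistics import mean, stdev


def classify_observation(history: list[float], observed: float, k: float = 3.0) -> str:
    """Compare an observed user estimate against a naive prediction.

    The prediction is the mean of `history` (at least two values);
    the band is k sample standard deviations wide on each side.
    """
    expected = mean(history)
    spread = stdev(history)
    if observed < expected - k * spread:
        return "possible censorship event"
    if observed > expected + k * spread:
        return "possible release of censorship"
    return "within expected range"


# Hypothetical user estimates for the previous five days.
recent_days = [1000, 1020, 980, 1010, 990]
verdict = classify_observation(recent_days, observed=400)
```

A sharp drop to 400 falls below the band and is flagged as a possible censorship event, while a value such as 1005 stays within the expected range.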