I am using NTP 4.2.8p15 with VxWorks. The current issue I have with NTP is that when I configure more than 3 servers (my box allows configuring up to 5), the daemon always discards the servers beyond the first 3, as in this example (2 out of 5 discarded); i.e., I cannot use more than 3 servers, otherwise the NTP client won't sync to all of them.
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     .LOCL.           5 l 101m    8    0    0.000   +0.000   0.000
-192.168.1.143   10.32.35.198     3 u  410  512  377    0.532   -3.020   0.460
+2620:11b:d06d:f 10.176.6.101     3 u   48  512  377    0.757   +1.801   0.472
-192.168.1.140   149.56.19.163    3 u   81  512  377    0.553   +2.506   0.593
*192.168.1.146   10.176.6.101     3 u   82  512  377    0.776   +1.880   0.469
+192.168.1.204   10.176.6.101     3 u  374  512  377    0.545   +1.867   0.440
After some time, the daemon may resync with one of the rejected servers, but it keeps rejecting at least one of the servers.
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     .LOCL.           5 l 192m    8    0    0.000   +0.000   0.000
-192.168.1.143   10.32.35.198     3 u   87 1024  377    0.553   -7.216   2.259
+2620:11b:d06d:f 10.176.6.101     3 u  825 1024  377    0.676   -1.294   1.693
*192.168.1.140   149.56.19.163    3 u  266 1024  377    0.595   -3.259   3.592
+192.168.1.146   10.176.6.101     3 u  233 1024  377    0.828   -2.009   2.260
+192.168.1.204   10.176.6.101     3 u 1049 1024  377    0.553   -2.856   2.957
This is another instance, from a different box, but with the same 5 servers configured:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 127.127.1.0     .LOCL.           5 l 117m    8    0    0.000   +0.000   0.000
*2620:11b:d06d:f 10.176.6.101     3 u  232  256  377    1.273   +0.107   0.395
-192.168.1.146   10.176.6.101     3 u  138  512  377    1.302   +0.342   0.214
+192.168.1.140   149.56.19.163    3 u  211  512  377    1.092   -0.059   0.166
+192.168.1.204   10.176.6.101     3 u    8  512  377    1.123   -0.395   0.339
-192.168.1.143   10.32.35.198     3 u  246  256  377    1.176   -4.317   0.308
Daemon config:
ntpd.config.param=restrict 127.0.0.1;server 127.127.1.0 minpoll 3 maxpoll 3 iburst;server 2620:11b:d06d:f10a:4a4d:7eff:fea2:b2d1 iburst minpoll 6 maxpoll 10;server 192.168.1.146 iburst minpoll 6 maxpoll 10;server 192.168.1.140 iburst minpoll 6 maxpoll 10;server 192.168.1.204 iburst minpoll 6 maxpoll 10; server 192.168.1.243 iburst minpoll 6 maxpoll 10;
ntpd.init.param=-g -f /tffs0/ntpd_driftfile
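For readability, the same parameters written out as a conventional ntp.conf would be (assuming the packed ntpd.config.param string is simply semicolon-separated config lines):

restrict 127.0.0.1
server 127.127.1.0 minpoll 3 maxpoll 3 iburst
server 2620:11b:d06d:f10a:4a4d:7eff:fea2:b2d1 iburst minpoll 6 maxpoll 10
server 192.168.1.146 iburst minpoll 6 maxpoll 10
server 192.168.1.140 iburst minpoll 6 maxpoll 10
server 192.168.1.204 iburst minpoll 6 maxpoll 10
server 192.168.1.243 iburst minpoll 6 maxpoll 10

with ntpd then started as: ntpd -g -f /tffs0/ntpd_driftfile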
I checked reported bugs against 4.2.8p15 through 4.2.8p18 and did not find any related to this issue.
Note that if I configure only 3 servers (any 3 of the above 5), the NTP daemon syncs to all 3 with no issue.
Is this a known limitation of the NTP daemon when more than 3 servers are configured? Or is it expected behavior due, for example, to changes in network latency, offset, jitter…? Has anyone else seen a similar issue?
Also, another issue I came across: if NTP is configured as client + server, it takes around 5 minutes to converge and sync to the local system clock when the NTP client cannot reach any of the configured servers. Is there anything I can do to speed up syncing to the local system clock in this case?
Any insights and help to solve those issues will be appreciated.
Thanks,
RC
Thanks @pessimus192 for your reply. Just for my own clarification, do I need to increase minclock to 6, or maxclock to 6? On my system, minclock = 3 and maxclock = 10.
Hi Harlan,
You mean the following:
+ : included by the combine algorithm
# : backup (more than tos maxclock sources)
' ': (a space) discarded as not valid (TEST10-TEST13)
x : discarded by intersection algorithm
. : discarded by table overflow (not used)
- : discarded by the cluster algorithm
In our case, the servers above 3 are being discarded with the - sign, i.e., by
the cluster algorithm.
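Incidentally, the reason a peer was discarded can also be read directly from ntpq's association listing, where the cluster-algorithm casualties show a condition of "outlier" (illustrative output only; exact columns, values, and spelling vary by ntpq version):

ntpq -c associations

ind assid status  conf reach auth condition  last_event cnt
===========================================================
  1 40828  963a   yes   yes  none  sys.peer    sys_peer  3
  2 40829  9324   yes   yes  none  outlier    reachable  2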
From the documentation:
tos minclock minclock
Specify the number of servers used by the clustering algorithm as the minimum to include on the candidate list. The default is 3. This is also the number of servers to be averaged by the combining algorithm.
So from the above, I understand that I need to increase minclock from 3 to 6. Is this correct?
On Mon, Feb 17, 2025 at 4:40 PM Danny Mayer <questions@lists.ntp.org> wrote:
The minclock is related to how often the server is being queried.
Raising it merely means you are asking less frequently.
Hi Danny, this is actually not right. From the fine manual: "tos
minclock /minclock/ [...] Specify the number of servers used by the clustering algorithm as the minimum to include on the candidate list.
The default is 3. This is also the number of servers to be averaged by
the combining algorithm." You might be thinking of the "minpoll"
parameter, used to adjust peer polling interval.
On Mon, Feb 17, 2025, 14:40 Danny Mayer <questions@lists.ntp.org> wrote:
No. They are being thrown out because the offset is too far outside the
ones compared to the rest of the list.
The minclock is related to how often the server is being queried.
Raising it merely means you are asking less frequently.
Then, please forgive this fool, because I thought:
Minpoll was how frequently* you will try to get data from clocks (in 2^n seconds).
Maxpoll was how infrequently*, by the same metric.
Minsane was how many clocks need to agree for good time.
Minclock was how many clocks are needed, including those who disagree.
Maxclock was how many can get to the quorum at most.
* barring burst and iburst, which I consider not very nice.
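For reference, the last three of those knobs are set via the tos directive. Spelled out with their documented defaults (an illustration, not a recommendation), the line would be:

tos minsane 1 minclock 3 maxclock 10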
I changed the minclock on my system from 3 to 6, and it looks like this solves the issue of discarding the servers above 3.
For the second issue, using 'tos orphan 10 orphanwait 0' did not help. If anyone has other ideas, please let me know.
Thank you all for the replies. I still have not seen a confirmation or objection on whether I should increase minclock from 3 to 6... But I tried this change on my system, and now I see the NTP daemon including all 5 servers and not discarding any. So it looks like this change solves the issue I was having with NTP.
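For reference, the change amounts to a single extra line in the daemon configuration (shown here in plain ntp.conf form; on my box it goes into the packed ntpd.config.param string as another semicolon-separated entry):

tos minclock 6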
As for speeding up NTP's convergence to the local system clock when it cannot reach any of the configured external servers, 'tos orphan 10 orphanwait 0' unfortunately did not help. It still takes ~5 minutes to converge. Any ideas?
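For the record, this is roughly what was tried (a sketch in plain ntp.conf form). One thing worth noting: the orphan-mode documentation describes it as a replacement for the 127.127.1.0 local-clock driver, so keeping both configured may itself be part of the problem:

tos orphan 10 orphanwait 0
# per the orphan-mode docs, the local-clock driver line would be dropped:
# server 127.127.1.0 minpoll 3 maxpoll 3 iburst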
Yes, in theory I should not worry about this NTP behavior, as it is after all controlled by the selection algorithm based on the quality of communication with the server, precision, etc. However, what made me follow up on this issue is mainly this:
We allow the user to configure up to 5 time sources on our system, so if all 5 time servers are supposedly good, the expectation is that one of them will be selected as the system peer and the remaining ones as backups. Not having this behavior will create confusion and raise questions. I understand that the system peer may change as NTP polls the servers, but still, one will be selected as system peer and the rest as backups (of course assuming all 5 are still considered good time sources by the selection algorithm!).
We already have maxclock set to 10 on our system (the default value), so why would I need to bump it up further?
What's wrong with bumping up minclock? Isn't this the value that allows more servers into the selection process, which is what I found in my testing?
If we bring more servers into the selection process, NTP should still include only the good sources and discard the bad ones; but I am not sure whether this may slow down the selection, since the more servers we have, the more calculations NTP needs to make before deciding which ones to keep or discard.
From the documentation: "The cluster algorithm operates in a series of rounds. The rounds are continued until a specified termination condition is met."
When we have more than 3 servers configured (5 in our case), increasing minclock from 3 to 6 should keep all 5 survivors in the cluster algorithm and terminate the pruning rounds sooner, based on my understanding of the documentation snippets above. This should lead to the additional survivors being tagged in ntpq with a '+' sign instead of a '-' sign, since they are no longer pruned.
Correct?
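To make the pruning concrete, here is a small, self-contained sketch of the RFC 5905 cluster loop in C (not the actual ntpd source; the offsets and jitters are the five remote peers from the first ntpq listing above, in ms). With minclock = 3 it prunes exactly two peers, matching the two '-' tallies; with minclock = 6 it never prunes:

/* Sketch of the RFC 5905 cluster pruning loop -- NOT the ntpd source. */
#include <stdio.h>
#include <math.h>

#define NCAND 5

int main(void) {
    double offset[NCAND] = { -3.020, 1.801, 2.506, 1.880, 1.867 };
    double jitter[NCAND] = {  0.460, 0.472, 0.593, 0.469, 0.440 };
    int alive[NCAND] = { 1, 1, 1, 1, 1 };
    int n = NCAND;
    int minclock = 3;               /* the "tos minclock" knob */

    while (n > minclock) {
        double max_sel = -1.0, min_pj = 1e9;
        int worst = -1;

        for (int i = 0; i < NCAND; i++) {
            if (!alive[i])
                continue;
            /* "Select jitter": RMS distance of this peer's offset
             * from the offsets of the other survivors. */
            double s = 0.0;
            for (int j = 0; j < NCAND; j++) {
                if (alive[j] && j != i) {
                    double d = offset[i] - offset[j];
                    s += d * d;
                }
            }
            double sel = sqrt(s / (n - 1));
            if (sel > max_sel) { max_sel = sel; worst = i; }
            if (jitter[i] < min_pj) min_pj = jitter[i];
        }
        /* Terminate when pruning no longer improves the ensemble. */
        if (max_sel < min_pj)
            break;
        printf("prune peer %d: select jitter %.3f > min peer jitter %.3f\n",
               worst, max_sel, min_pj);
        alive[worst] = 0;
        n--;
    }
    printf("%d survivors remain\n", n);
    return 0;
}

(Compile with: cc sketch.c -lm)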
From the different answers I got, increasing minclock replaces the '-' signs with '+' signs on those previously discarded servers, but it may make our client more fragile. So what I take from all these discussions is that we had better keep minclock at 3 and simply document this behavior as expected!
I can say we are focusing more on the count of '+' and '-' signs. We may admittedly be overly paranoid about this, but the concern raised is that with 2 servers out of 5 showing as discarded ('-' sign) almost all the time, our clients will raise questions when they see that those discarded servers work fine with other boxes (with a '*' or '+' sign).
I did not run any statistical tests with those servers.
rcheaito via questions Mailing List <questions@lists.ntp.org> wrote:
I can say we are focusing more on the count of '+' and '-' signs. We may obviously be so paranoid by this, however the concern raised is that with 2 servers out of 5 showing as discarded ('-' sign) almost all the time, our clients will have questions raised when they know that those discarded servers are working fine with other boxes (with '*' or '+' sign).
It sounds to me like neither you nor your clients have a clue how ntp
works and what the + and - signs actually mean to the overall operation
of ntp.
I did not run any statistical tests with those servers.
From the different answers I got, increasing the minclock helps replace the '-' signs with the '+' signs against those previously discarded servers, but it may make our client more fragile. So, what I take from all those discussions is that we better keep the minclock as 3 and then just document this behavior as expected!
Actually, the behavior is already well described and documented, which
is why I question your obsession over it.
Counting + and - signs tells you absolutely nothing about how accurate
the system clock is.
If you actually want to know how accurate the clock is, at least use something like ntpstat which will give information like:
synchronised to UHF radio at stratum 1
time correct to within 2 ms
polling server every 16 s
Better still is to use ntpviz, which will graph things and give in-depth reports.
On 2025-02-22, Jim Pennino <jimp@gonzo.specsol.net> wrote:
If you actually want to know how accurate the clock is, at least use
something like ntpstat which will give information like:
synchronised to UHF radio at stratum 1
time correct to within 2 ms
polling server every 16 s
UHF is a terrible server. Its precision is 2 ms. Its accuracy is way worse than that. It is not a server for which you can determine the round-trip time (your system to the UHF radio station, and back to your server).
If you want to determine the accuracy of your time, get a GPS time receiver. GPS knows both where you are and where the satellite is, and thus can accurately determine the one-way distance and the one-way time lag (well, modulo the atmospheric lag fluctuation due to the ionosphere, and the water vapour changes in the air). Also, while you are at it, you can use the GPS time and one of your servers and get an accuracy of microseconds, not milliseconds. One GPS clock can be worth a million internet servers.
Why ntpstat calls it UHF radio I have no clue.