* node: decrease retry conn timeout in test
Should fix#3256
The retry timeout was set to the default, which is the same as the
accept timeout, so it's no wonder this would fail. Here we decrease the
retry timeout so we can try many times before the accept timeout.
* p2p: increase handshake timeout in test
This fails sometimes, presumably because the handshake timeout is so low
(only 50ms). So increase it to 1s. Should fix#3187
* privval: fix race with ping. closes#3237
Pings happen in a go-routine and can happen concurrently with other
messages. Since we use a request/response protocol, we expect to send a
request and get back the corresponding response. But with pings
happening concurrently, this assumption could be violated. We were using
a mutex, but only a RWMutex, where the RLock was being held for sending
messages - this was to allow the underlying connection to be replaced if
it fails. Turns out we actually need to use a full lock (not just a read
lock) to prevent multiple requests from happening concurrently.
* node: fix test name. DelayedStop -> DelayedStart
* autofile: Wait() method
In the TestWALTruncate in consensus/wal_test.go we remove the WAL
directory at the end of the test. However the wal.Stop() does not
properly wait for the autofile group to finish shutting down. Hence it
was possible that the group's go-routine is still running when the
cleanup happens, which causes a panic since the directory disappeared.
Here we add a Wait() method to properly wait until the go-routine exits
so we can safely clean up. This fixes#2852.
There's a time window after we call RotateFile() where autofile#index+1
does not exist. It will be created during the next call to Write(). BUT
if somebody calls NewReader() before Write(), it will fail with "open
/tmp/wal#index+1/wal: no such file or directory"
We must create file (either by calling gr.Head.openFile() or directly)
during NewReader() to ensure read calls succeed.
Closes#2538
* fix Group.RotateFile need call Flush() before rename. #2428
* fix some review issue. #2428
refactor Group's config: replace setting member with initial option
* fix a handwriting mistake
* fix a time window error between rename and write.
* fix a syntax mistake.
* change option name Get_ to With_
* fix review issue
* fix review issue
1) no need to stop the ticker in createTestGroup() method
2) now there is a symmetry - we start the ticker in OnStart(), we stop it
in OnStop()
Refs #2072
short: flushing the bufio buffer is not enough to ensure data
consistency.
long:
Saving an entry to the WAL calls writeLine to append data to the
autofile group backing the WAL, then calls group.Flush() to flush that
data to persistent storage. group.Flush() in turn proxies to
headBuf.flush(), flushing the active bufio.BufferedWriter. However,
BufferedWriter wraps a Writer, not another BufferedWriter, and the way
it flushes is by calling io.Writer.Write() to clear the BufferedWriter's
buffer. The io.Writer we're wrapping here is AutoFile, whose Write
method calls os.File.Write(), performing an unbuffered write to the
operating system, where, I assume, it sits in the OS buffers awaiting
sync. This means that Wal.Save does not, in fact, ensure the saved
operation is synced to disk before returning.