I struggled with this a bit, and ended up getting it right by guessing, even though I should have read the manual to begin with.
So the procedure is simple (quoted from the manual, page 7, “Pairing”):
- Turn the ignition on.
- Make sure the Bluetooth feature of your phone is turned on.
- Start the pairing procedure on your mobile phone.
- When prompted for a passkey, enter 1234 on your mobile phone
The crucial hint is that nothing is expected to happen in the car radio’s “Phone” menu or any other setup menu, unlike what is shown in many video tutorials.
So on an Android phone, open Settings > Bluetooth and make it search for devices. Once it finds the car radio, enter the 1234 passcode (the phone actually suggested trying 1234 and 0000, so it was 1234). Don’t expect anything to happen on the car radio’s side nor on the dashboard display until the phone is paired.
I managed to pair two phones (the manual says up to five are allowed).
Introduction
This is a summary of a few topics that should be kept in mind when a Multi-Gigabit Transceiver (MGT) is employed in an FPGA design. It’s not a substitute for reading the relevant user guide, nor a tutorial. Rather, it’s here to point at issues that may not be obvious at first glance.
The terminology and signal names are those used with Xilinx FPGAs. The transceiver is referred to as GTX (Gigabit Transceiver), but other variants of transceivers, e.g. GTH and GTZ, are to a large extent the same components with different bandwidth capabilities.
Overview
GTXs, which are the basic building block of common interface protocols (e.g. PCIe and SATA), are becoming an increasingly popular solution for communication between FPGAs. As the GTX instance presents a clock and a parallel data interface, it’s easy to mistake it for a simple channel that moves the data to the other end in a failsafe manner. A more realistic view of the GTX is as the front end of a modem, with possible bit errors and a need to synchronize the serial-to-parallel data alignment at the receiver. Designing with the GTX also requires attention to classic communication topics, e.g. the use of data encoding, equalizers and scramblers.
As a result, there are a few application-dependent pieces of logic that need to be developed to support the channel:
- The possibility of bit errors on the channel must be handled
- The alignment from a bit stream to a parallel word must be taken care of (which bit is the LSB of the parallel word in the serial stream?)
- If the transmitter and receiver aren’t based on a common clock, a protocol that injects and tolerates idle periods in the data stream must be used, or the clock difference will cause data underflows or overflows. Sending the data in packets is a common solution. In the pauses between these packets, special skip symbols must be inserted into the data stream, so that the GTX receiver’s clock correction mechanism can remove or add such symbols in the stream presented to the application logic, which runs at a clock slightly different from the received data stream.
- Odds are that a scrambler needs to be applied on the channel. This requires logic that creates the scrambling sequence, as well as logic that synchronizes the receiver. The reason is that an equalizer assumes that the bit stream is, on average, uncorrelated. Any average correlation between bit positions is considered ISI and is “fixed”. See Wikipedia’s entry on intersymbol interference.
Having said the above, it’s not uncommon that no bit errors are ever observed on a GTX channel, even at very high rates, and possibly with no equalization enabled. This can’t be relied on, however, as there is in fact no express guarantee for the actual error probability of the channel.
Clocking
The clocking of the GTXs is an issue in itself. Unlike the logic fabric, each GTX has a limited number of possible sources for its reference clock. It’s mandatory to ensure that the reference clock(s) are present in one of the allowed dedicated inputs. Each clock pin can function as the reference clock of up to 12 particular GTXs.
It’s also important to pay attention to the generation of the serial data clocks for each GTX from the reference clock(s). It’s not only a matter of what multiplication ratios are allowed, but also how to allocate PLL resources and their access to the required reference clocks.
QPLL vs. CPLL
Two types of PLLs are available for producing the serial data clock, typically running at several GHz: QPLLs and CPLLs.
The GTXs are organized in groups of four (“quads”). Each quad shares a single QPLL (Quad PLL), which is instantiated separately (as a GTXE2_COMMON). In addition, each GTX has a dedicated CPLL (Channel PLL), which can generate the serial clock for that GTX only.
Each GTX may select its clock source from either the (common) QPLL or its dedicated CPLL. The main difference between these is that the QPLL covers higher frequencies. High-rate applications are hence forced to use the QPLL. The downside is that all GTXs sharing the same QPLL must have the same data rate (except that each GTX may divide the QPLL’s clock by a different ratio). The CPLLs allow for greater flexibility in clock rates, as each GTX can pick its clock independently, but within a limited frequency range.
Jitter
Jitter on the reference clock(s) is the silent killer of GTX links. It’s often neglected by designers because “it works anyhow”, but jitter on the reference clock has a disastrous effect on the channel’s quality, which can be by far worse than a poor PCB layout. As both jitter and poor PCB layout (and/or cabling) contribute to the bit error rate and the channel’s instability, the PCB design is often blamed when things go bad. And indeed, playing with the termination resistors or similar black-magic actions sometimes “fix it”. This makes people believe that GTX links are extremely sensitive to every via or curve in the PCB trace, which is not the case at all. It is, on the other hand, very sensitive to the reference clock’s jitter. And with some luck, a poorly chosen reference clock can be compensated for with a very clean PCB trace.
Jitter is commonly modeled as a noise component added to the timing of the clock transition, i.e. t=kT+n (n is the noise). Consequently, it is often specified in terms of the RMS of this noise component, or a maximal value which is crossed at a sufficiently low probability. The treatment of a GTX’s reference clock requires a slightly different approach; the RMS figures are not necessarily a relevant measure. In particular, clock sources with excellent RMS jitter may turn out inadequate, while other sources, with less impressive RMS figures, may work better.
Since the QPLL or CPLL locks on this reference clock, jitter on the reference clock results in jitter in the serial data clock. The prevailing effect is on the transmitter, which relies on this serial data clock; the receiver is mainly based on the clock it recovers from the incoming data stream, and is therefore less sensitive to jitter.
Some of the jitter is fairly harmless, in particular “slow” jitter (based upon low-frequency components), as the other side’s receiver clock synchronization loop will cancel its effect by tracking the random shifts of the clock. On the other hand, very fast jitter in the reference clock may not be picked up by the QPLL/CPLL, and is hence harmless as well.
All in all, there’s a certain band of frequency components in the clock’s timing noise spectrum which remains relevant: the band that causes jitter components slow enough for the QPLL/CPLL to track (and hence be present on the serial data clock), yet too fast for the receiver’s tracking loop to follow. The measurable expression for this selective jitter requirement is given in terms of phase noise frequency masks, or sometimes as the RMS jitter in bandwidth segments (e.g. PCIe Base spec 2.1, section 4.3.7, or Xilinx’ AR 44549). Such spectrum masks, as required for the GTX, are published by the hardware vendors. The spectral behavior of clock sources is often more difficult to predict: even when noise spectra are published in datasheets, they are commonly given only for certain scenarios, as typical figures.
8b/10b encoding
Several standardized uses of MGT channels (SATA, PCIe, DisplayPort etc.) involve a specific encoding scheme between payload bytes for transmission and the actual bit sequence on the channel. Each (8-bit) byte is mapped to a 10-bit word, based upon a rather peculiar encoding table. The purpose of this encoding is to ensure a balance between the number of 0′s and 1′s on the physical channel, allowing AC-coupling of the electrical signal. This encoding also ensures frequent toggling between 0′s and 1′s, which ensures proper bit synchronization at the receiver by virtue of the clock recovery loop (“CDR”). Other things worth noting about this encoding:
- As there are 1024 possible code words covering 256 possible input bytes, some of the excess code words are allocated as control characters. In particular, a control character designated K.28.5 is often referred to as a “comma”, and is used for synchronization.
- The 8b/10b encoding is not an error correction code despite its redundancy, but it does detect some errors, if the received code word is not decodable. On the other hand, a single bit error may lead to a completely different decoded word, without any indication that an error occurred.
Scrambling
To put it short: if an equalizer is applied, the user-supplied data stream must be random. If the data payload can’t be ensured to be random by itself (which is almost always the case), a scrambler must be defined in the communication protocol, and applied in the logic design.
Applying a scrambler on the channel is a tedious task, as it requires a synchronization mechanism between the transmitter and receiver. It’s often quite tempting to skip it, as the channel will work quite well even in the absence of a scrambler, even where one is needed. In the long run, however, occasional channel errors are typically experienced.
The rest of this section attempts to explain the connection between the equalizer and the scrambler. It’s not the easiest piece of reading, so it’s fine to skip it if my word on this is enough for you.
In order to understand why scrambling is probably required, it’s first necessary to understand what an equalizer does.
The problem equalizers solve is the filtering effect of the electrical media (the “channel”) through which the bit stream travels. Both cables and PCBs reduce the strength of the signal, but even worse: the attenuation depends on the frequency, and reflections occur along the metal trace. As a result, the signal doesn’t just get smaller in magnitude, but is also smeared over time. A perfect, sharp, step-like transition from -1200 mV to +1200 mV at the transmitter’s pins may end up as a slow and round rise from -100 mV to +100 mV. Because of this slow motion of the transitions at the receiver, the clear boundaries between the bits are broken. Each transmitted bit keeps leaving its traces well after its time period. This is called Inter-Symbol Interference (ISI): the received voltage at the sampling time for the bit at t=0 depends on the bits at t=-T, t=-2T and so on. Each bit effectively produces noise for the bits coming after it.
This is where the equalizer comes in. Its input is the voltage sample of the bit at t=0, along with a number of measured voltage samples of the bits before and after it. By making a weighted sum of these inputs, the equalizer manages, to a large extent, to cancel the Inter-Symbol Interference. In a way, it implements an inverse filter of the channel.
So how does the equalizer acquire the coefficients for each of the samples? There are different techniques for training an equalizer to work effectively against the channel’s filtering. For example, cellular phones do their training based upon a sequence of bits on each burst, which is known in advance. But when the data stream runs continuously, and the channel may change slightly over time (e.g. a cable is being bent) the training has to be continuous as well. The chosen method for the equalizers in GTXs is therefore continuous.
The Decision Feedback Equalizer, for example, starts with making a decision on whether each input bit is a ’0′ or ’1′. It then calculates the noise signal for this bit, by subtracting the expected voltage for a ’0′ or ’1′ (whichever was decided upon) from the measured voltage. The algorithm then slightly alters the weighted sums in a way that removes any statistical correlation between the noise and the previous samples. This works well when the bit sequence is completely random: there is no expected correlation between the input samples, and if such a correlation exists, it’s rightfully removed. Also, the adaptation converges into a compromise that works best, on average, for all bit sequences.
But what happens if there is a certain statistical correlation between the bits in the data itself? The equalizer will specialize in reducing the ISI for the bit patterns that occur more often, possibly doing very badly on the less frequent patterns. The equalizer’s role is to compensate for the channel’s filtering effect, but instead it adds a filtering element of its own, based upon the common bit patterns. In particular, note that if a constant pattern runs through the channel when there’s no data for transmission (zeros, idle packets etc.), the equalizer will specialize in getting that no-data through, and mess up the actual data.
One could be led to think that the 8b/10b encoding plays a role in this context, but it doesn’t. Even though it cancels out DC on the channel, it does nothing about the correlation between the bits. For example, if the payload for transmission consists of zeros only, the encoded words on the channel will be either 1001110100 or 0110001011. The DC on the channel will remain zero, but the statistical correlation between the bits is far from zero.
So unless the data is inherently random (e.g. an encrypted stream), using an equalizer means that the data which is supplied by the application to the transmitter must be randomized.
The common solution is a scrambler: XORing the payload data with a pseudo-random sequence of bits, generated by a simple state machine. The receiver must XOR the incoming data with the same sequence in order to retrieve the payload data. The comma (K28.5) symbol is often used to synchronize both state machines.
In GTX applications, the (by far) most commonly used scrambler is the G(X)=X^16+X^5+X^4+X^3+1 LFSR, which is defined in a friendly manner in the PCIe standard (e.g. the PCI Express Base Specification, rev. 1.1, section 4.2.3 and in particular Appendix C).
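Just to illustrate the “simple state machine” mentioned above, here’s a toy software model of that LFSR (a sketch in plain bash rather than HDL; the reset value is assumed, and the exact bit ordering and K-character bypass are as the PCIe spec defines them, not reproduced here):

#!/bin/bash
# Toy model of the X^16+X^5+X^4+X^3+1 LFSR; prints its first 16 output bits.
lfsr=0xFFFF                                # assumed reset value
for ((i = 0; i < 16; i++)); do
  fb=$(( (lfsr >> 15) & 1 ))               # output bit, XORed with a payload bit
  lfsr=$(( ((lfsr << 1) | fb) & 0xFFFF ))  # shift, feeding back into the X^0 tap
  (( fb )) && lfsr=$(( lfsr ^ 0x0038 ))    # ... and into the X^3, X^4, X^5 taps
  printf '%d' "$fb"
done
echo

In the real thing, this state machine is of course implemented in the logic fabric, advancing several bits per clock cycle.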
TX/RXUSRCLK and TX/RXUSRCLK2
Almost all signals between the FPGA logic fabric and the GTX are clocked with TXUSRCLK2 (for transmission) and RXUSRCLK2 (for reception). These signals are supplied by the user application logic, without any special restriction, except that the frequency must match the GTX’ data rate so as to avoid overflows or underflows. A common solution for generating this clock is therefore to drive the GTX’ RX/TXOUTCLK through a BUFG.
The logic fabric is required to supply a second clock in each direction, TXUSRCLK and RXUSRCLK (without the “2” suffix). These two clocks are the parallel data clocks in a deeper position of the GTX.
The rationale is that sometimes it’s desired to let the logic fabric work with a word width twice as wide as the natural one. For example, in a high-end data rate application, the GTX’ word width may be set to 40 bits with 8b/10b, so the logic fabric would interface with the GTX through a 32-bit data vector. But because of the high rate, the clock frequency may still be too high for the logic fabric, in which case the GTX allows halving the clock and applying the data through an 80-bit word. In this case, the logic fabric supplies the 80-bit word clocked with TXUSRCLK2, and is also required to supply a second clock, TXUSRCLK, having twice the frequency and being phase aligned with TXUSRCLK2. TXUSRCLK is for the GTX’ internal use.
A similar arrangement applies for reception.
Unless the required data clock rate is too high for the logic fabric (which is usually not the case), this dual-clock arrangement is best avoided, as it requires an MMCM or PLL to generate two phase-aligned clocks. The lower clock rate presented to the logic fabric is the only benefit of this arrangement.
Word alignment
On the transmitting side, the GTX receives a vector of bits, which forms a word for transmission. The width of this word is one of the parameters set when the GTX is instantiated, and so is whether 8b/10b encoding is applied. Either way, some format of parallel words is transmitted over the channel in a serialized manner, bit after bit. Unless explicitly arranged for, there is nothing in this serial bitstream to indicate the words’ boundaries. Hence the receiver has no way, a priori, to recover the word alignment.
The output of the receiver’s GTX consists of a parallel vector of bits, typically with the same width as the transmitter’s. Unless a mechanism is employed by the user logic, the GTX has no way to recover the correct alignment. Without such alignment, the organization into parallel words arrives wrong at the receiver, possibly as complete garbage, as an incorrect alignment prevents 8b/10b decoding (if employed).
It’s up to the application logic to implement a mechanism for synchronizing the receiver’s word alignment. There are two methodologies for this: moving the alignment one bit at a time at the receiver’s side (“bit slipping”) until the data arrives properly, or transmitting a predefined pattern (a “comma”) periodically, and synchronizing the receiver when this pattern is detected.
Bit slipping is the less recommended practice, even though simpler to understand. It keeps most of the responsibility in the application logic’s domain: The application logic monitors the arriving data, and issues a bit slip request when it has gathered enough errors to conclude that the alignment is out of sync.
However, most well-established GTX-based protocols use commas for alignment. This method is easier in the sense that the GTX aligns the word automatically when a comma is detected (if the GTX is configured to do so). If injecting comma characters periodically into the data stream fits well in the protocol, this is probably the preferred solution. The comma character can also be used to synchronize other mechanisms, in particular the scrambler (if employed).
Comma detection may also have false positives, resulting from errors on the raw data channel. As these channels usually have a very low bit error rate (BER), this possibility can be overlooked in applications where a short-term false alignment resulting from a falsely detected comma is acceptable. When this is not acceptable, the application logic should monitor the incoming data, and disable the GTX automatic comma alignment through the rxpcommaalignen and/or rxmcommaalignen inputs of the GTX.
Tx buffer, to use or not to use
The Tx buffer is a small dual-clock (“asynchronous”) FIFO in the transmitter’s data path, plus some logic that makes sure it starts off about half full.
The underlying problem, which the Tx buffer potentially solves, is that the serializer inside the GTX runs on a certain clock (XCLK) while the application logic is exposed to another clock (TXUSRCLK). The frequency of these clocks must be exactly the same to prevent overflow or underflow inside the GTX. This is fairly simple to achieve. Ensuring proper timing relationships between these two clocks is however less trivial.
There are hence two possibilities:
- Not requiring a timing relationship between these clocks (just the same frequency). Instead, a dual-clock FIFO interfaces between the two clock domains. This small FIFO is referred to as the “Tx buffer”. Since it’s part of the GTX’ internal logic, going this path doesn’t require any additional resources from the logic fabric.
- Making sure that the clocks are aligned, by virtue of a state machine. This state machine is implemented in the logic fabric.
The first solution is simpler and requires fewer resources from the FPGA’s logic fabric. Its main drawback is the latency of the Tx buffer, typically around 30 TXUSRCLK cycles. While this delay is usually negligible from a functional point of view, it’s not possible to predict its exact magnitude. It’s therefore not possible to use the Tx buffer on several parallel lanes of data if the protocol requires a known alignment between the data in these lanes, or when an extremely low latency is required.
The second solution requires some extra logic, but there is no significant design effort: the logic that aligns the clocks is included automatically by the IP core generator on Vivado 2014.1 and later, when the “Tx/Rx buffer off” mode is chosen.
Xilinx’ GTX documentation is somewhat misleading in that it details the requirements of the state machine in painful detail: there’s no need to read through that long saga in the user guide. As a matter of fact, this logic is included automatically by the IP core generator on Vivado 2014.1, so there’s really no reason to dive into this issue. Only note that gtN_tx_fsm_reset_done_out may take a bit longer to assert after a reset (something like 1 ms on a 10 Gb/s lane).
Rx buffer
The Rx buffer (also called “Rx elastic buffer”) is also a dual-clock FIFO, which is placed in the same clock domain gap as the Tx buffer, and has the same function. Bypassing it requires the same kind of alignment mechanism in the logic fabric.
As with its Tx counterpart, bypassing the Rx buffer makes the latency short and deterministic. It’s however less common that such a bypass is practically justified: while a deterministic Tx latency may be required to ensure data alignment between parallel lanes in order to meet certain standard protocol requirements, there are almost always fairly easy methods to compensate for the unknown latency in user logic. Either way, it’s preferable not to rely on the transmitter to meet requirements on data alignment, and to align the data, if required, by virtue of user logic.
Leftover notes
- sysclk_in must be stable when the FPGA wakes up from configuration. A state machine that brings up the transceivers is based upon this clock. It’s referred to as the DRP clock in the wizard.
- It’s important to declare the DRP clock’s frequency correctly, as certain required delays which are measured in nanoseconds are implemented by dwelling for a number of clocks, which is calculated from this frequency.
- In order to transmit a comma, set txcharisk to 1 (since it’s a vector, this sets the LSB) and the 8 LSBs of the data to 0xBC, which is the code for K.28.5.
Problem: My LG G4 (Android 5.1, kernel 3.10.49) suddenly ignored my home’s 5 GHz router. It saw the neighbors’ networks all right, but not mine.
Reason for the problem: I had activated the phone’s hotspot previously. It seemed like that locked the Wifi hardware to the 2.4 GHz band, as it happens to transmit on 2.437 GHz (channel 6). Correction: it seems like the phone doesn’t detect the 5 GHz hotspot if it wasn’t present on boot, regardless of hotspot activation.
Solution: Reboot the phone. As simple as that. That is, restart Android, with the 5 GHz hotspot present and with Wifi enabled on the phone (before rebooting or enabled soon after boot). Turning the phone off and on again would probably do the same job, but why bother. It has been suggested to turn flight mode on and off, but that didn’t work for me.
Conclusion: Treat your smartphone for what it is: A small, weak and expensive computer with a lot of silly bugs.
Update (Jun 2017): After changing to another 5 GHz channel, the phone detects the hotspot normally. It seems like the previous 5 GHz channel I used wasn’t an allowed frequency in Israel, so the phone ignored it (and sometimes didn’t). Or maybe there’s also some kind of software upgrade that has taken place since.
By the way, I tried to move the G4′s hotspot to 2.452 GHz (channel 9) in the Advanced Settings, but the reception signal on the laptop went down. Go figure.
This is just a messed up pile of jots as I tried to solve a specific problem. The actual problem turned out to be between chair and keyboard, but I decided to post this anyhow, just in case it will be useful in the future.
The setting was like this: I had a script, which called a suid-enabled program I wrote (jailer.c), which did a chroot() to a chroot jail and then called setgroups(), setgid() and setuid(), and eventually an execl() to a bash script, which was of course inside the chroot jail.
So all in all, the program could be called from any user, but thanks to its setuid to root, it could change the root, and then turn into another user.
In the bash script, the control was eventually turned over to another program (not mine, hence the chroot protection) with the bash built-in command exec. And all was fine.
But then I needed to continue the execution after the program. So I dumped the exec and used the good old invocation by just starting the line with the program’s name. And that failed colossally.
Spoiler: The reason turned out to be that the current process ID remains the same when exec is used, and changes when other methods are used (duh). As some preparations made before running the program had to match the process ID of the running program, exec worked and the other methods didn’t. So it was really my bad.
After a while I thought I figured out that somehow, all this mucking around (playing with users? setuid? chroot?) caused the program to fail in finding its library files.
So I added a
export LD_LIBRARY_PATH=/lib64:/special/lib/lin64
line to the bash script, which made the program work. Finally. Only now it segfaulted. Well, at least I knew I was doing something in the right direction.
The problem seemed to be that the program loaded an outdated libstdc++.so.6 file from its own library set, instead of the one in /usr/lib64/. LD_LIBRARY_PATH solved one issue, but since its paths are always searched before the default ones, it actually messed things up.
Being in a chroot environment, everything is controlled, so why not add the standard libraries into LD_LIBRARY_PATH? Ugly, but nobody said being in a (chroot) jail should be nice.
So what is the regular order of loading libraries? Well, I went
$ ldconfig -v | less
and picked up the paths that live in the jail, and put them before the special paths. And then fixed LD_LIBRARY_PATH to
export LD_LIBRARY_PATH=/usr/lib64:/lib64:/special/lib/lin64
This solved the issue with libstdc++.so.6, but the segfault remained.
Epilogue: All this didn’t solve the problem, but rather kept me busy with complicated stuff, while the actual solution was so much simpler. Maybe this will be useful for solving something else. Or I just wasted a few hours.
Introduction
My motivation for looking inside Vivado runs was that I wanted to implement a Vivado project from within XEmacs, using the Compile button, and all that within a rather tangled Makefile-based build system. But I also wanted to leave the possibility to open the project using Vivado’s GUI, if something went wrong or needed inspection. So working in non-project mode was out of the question.
On the face of it, the solution was simple: Just execute the runme.sh scripts in the run directories, or use launch_runs in a Tcl script. Well, that sounds simple, but there is no output to the console during these runs. In particular, the implementation is completely silent. I opted out of the fun of staring at the cursor for an hour or so, having no idea what’s going on during the implementation, which left me no option but to get my hands a bit dirty.
This was written in February 2016 and relates to Vivado 2015.2. Feel free to update stuff in the comments below.
It’s recommended to first take a look at this page, which discusses other aspects of scripting.
Preparing the runs & OOCs
Vivado runs are just an execution of a Tcl script in one of the *.runs directories. This holds true for all runs, both Out-Of-Context runs (OOCs, e.g. IP cores) as well as synthesis and implementation runs.
Say that the project’s name is myproj, and the top-level module’s name is top.v (or top.vhd, if you insist). As the project is generated, Vivado creates a directory named myproj.runs, which contains a set of subdirectories, for example fifo_32x512_synth_1/, fifo_8x2048_synth_1/, synth_1/ and impl_1/. In this example, the first two directories belong to two FIFO IPs, and the other two are implementation related.
synth_1 and impl_1 are most likely generated when the project is created in Vivado’s GUI, or with create_run Tcl calls if the project is generated with a setup script (again, take a look at this page). This is kinda out of scope here. The thing is to create and invoke the runs for the IPs (that is, the Out-Of-Context parts, OOCs).
My personal preference is to add these OOCs to the project with the following Tcl snippet:
foreach i $oocs {
    # Prefer an already-synthesized checkpoint over re-generating the IP
    if {[file exists "$essentials_dir/$i/$i.dcp"]} {
        read_checkpoint "$essentials_dir/$i/$i.dcp"
    } else {
        add_files -norecurse -fileset $obj "$essentials_dir/$i/$i.xci"
    }
}
To make a long story short, the idea is to include the DCP file rather than the XCI if possible, so the IP isn’t re-generated if it has already been synthesized. This means that the DCP file has to be deleted if the IP core’s attributes have been changed, or the changes won’t take effect.
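For example, assuming the directory layout from the Tcl snippet above (an “essentials” directory with one subdirectory per IP; the names here are hypothetical), forcing one FIFO IP to be re-generated is just a matter of deleting its checkpoint:

$ rm essentials/fifo_32x512/fifo_32x512.dcp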
We’ll assume that the IPs were included as XCIs, because including DCPs requires no runs.
The next step is to create the scripts for all runs with the following Tcl command:
launch_runs -scripts_only impl_1 -to_step write_bitstream
Note that thanks to the -scripts_only flag, no run is executed here; only the run directories and their respective scripts are created. In particular, the IPs are elaborated (or generated) at this point, but not synthesized.
Building the OOCs
It’s a waste of time to run the IPs’ syntheses one after the other, as they don’t depend on each other. So a parallel launch can be done as follows:
First, obtain a list of runs to be run, and reset them:
set ooc_runs [get_runs -filter {IS_SYNTHESIS && name != "synth_1"} ]
foreach run $ooc_runs { reset_run $run }
The filter grabs the synthesis targets of the IPs’ runs, and skips synth_1. Resetting is done because otherwise Vivado complains that it should be.
Next, launch these specific runs in parallel:
if { [ llength $ooc_runs ] } {
launch_runs -jobs 8 $ooc_runs
}
Note that ooc_runs may be an empty list, in particular if all IPs were loaded as DCPs before. If launch_runs is called with no runs, it fails with an error. To prevent this, $ooc_runs is checked first.
And then finally, wait for all runs to finish. wait_on_run can only wait on one specific run, but it’s fine looping on all launched runs. The loop will finish after the last run has finished:
foreach run $ooc_runs { wait_on_run $run }
Finally: Implementing the project
As mentioned above, launching a run actually consists of executing runme.sh (or runme.bat on Windows, never tried it). The runme.sh shell script sets the PATH with the current Vivado executable, and then invokes the following command with ISEWrap.sh as a wrapper:
vivado -log top.vds -m64 -mode batch -messageDb vivado.pb -notrace -source top.tcl
(Recall that “top” is the name of the toplevel module)
Spoiler: Just invoking the command above will execute the run with all log output going to console, but Vivado’s GUI will not reflect that the execution took place properly. More on that below.
It’s important to note that the “vivado” executable is invoked. This is in fact the way it’s done even when launched from within the GUI or with a launch_runs Tcl command. If the -jobs parameter is given to launch_runs, it will invoke the “vivado” executable several times in parallel. If you want to convince yourself that this indeed happens, note that you get something like this in the console inside Vivado’s GUI, which is exactly what Vivado prints out when invoked from the command line:
****** Vivado v2015.2 (64-bit)
**** SW Build 1266856 on Fri Jun 26 16:35:25 MDT 2015
**** IP Build 1264090 on Wed Jun 24 14:22:01 MDT 2015
** Copyright 1986-2015 Xilinx, Inc. All Rights Reserved.
Vivado’s invocation involves three flags that are undocumented:
- The -notrace flag simply means that Vivado doesn’t print out the Tcl commands it executes, which it would otherwise do by default. I drop this flag in my own scripts: With all the mumbo-jumbo that is emitted anyhow, the Tcl commands are relatively informative.
- The -m64 flag probably means “run in 64-bit mode”, but I’m not sure.
- The -messageDb seems to set the default message *.pb output, which is probably some kind of database from which the GUI takes its data to present in the Message tab. Note that the main Tcl script for impl_1 (e.g. top.tcl) involves several calls to create_msg_db followed by close_msg_db, which is probably how the implementation run has messages divided into subcategories. Just my guesses, since nothing of this is documented (not even these Tcl commands).
The ISEWrap.sh wrapper is crucially important if you want to be able to open the GUI after the implementation and work as if it was done in the GUI: It makes it possible for the GUI to tell which run has started, completed or failed. Namely, it creates two files, one when the run starts, and one when it ends.
For example, during the invocation of a run, .vivado.begin.rst is created (note the “hidden file name” starting with a dot), and contains something like this:
<?xml version="1.0"?>
<ProcessHandle Version="1" Minor="0">
<Process Command="vivado" Owner="eli" Host="myhost.localdomain" Pid="1003">
</Process>
</ProcessHandle>
And if the process terminates successfully, another empty file is created, .vivado.end.rst. If it failed, the empty file .vivado.error.rst is created instead. The synth_1 run creates only these two, but as for impl_1, individual files are generated for each step in the implementation Tcl script by virtue of file-related Tcl commands, e.g. .init_design.begin.rst, .place_design.begin.rst etc. (and matching end files). And yes, the run system is somewhat messy, in that these files are created in several different ways.
If these files aren’t generated, the Vivado GUI will get confused on whether the runs have taken place or not. In particular, the synth_1 run will stand at “Scripts Generated” even after a full implementation.
Bottom line
Recall that the reason for all this diving into the Vivado runs mechanism was to perform these runs with log output on the console.
The ISEWrap.sh wrapper (actually, the way it’s used) is the reason why there is no output to console during the run’s execution. The end of runme.sh goes:
ISEStep="./ISEWrap.sh"
EAStep()
{
$ISEStep $HD_LOG "$@" >> $HD_LOG 2>&1
if [ $? -ne 0 ]
then
exit
fi
}
# pre-commands:
/bin/touch .init_design.begin.rst
EAStep vivado -log top.vdi -applog -m64 -messageDb vivado.pb -mode batch -source top.tcl -notrace
The invocation of vivado is done by calling EAStep() with the desired command line as arguments. EAStep() passes these on to the wrapper as arguments, which in turn executes vivado as required, along with creating the begin/end files. But note the >> $HD_LOG 2>&1 redirection: the output goes into the log file, not to the console.
So one possibility is to rewrite runme.sh slightly, and modify EAStep() so it uses the “tee” UNIX utility or doesn’t redirect at all into a log file. Or modify the wrapper for your own needs. I went for option B (there were plenty of scripts anyhow in my build system).
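For the first option, a minimal sketch of a modified EAStep() could look like this (assuming runme.sh runs under bash, so PIPESTATUS is available; this is my own variation, not Xilinx’ code):

EAStep()
{
     # Pipe through tee, so the output reaches both the log file and the console
     $ISEStep $HD_LOG "$@" 2>&1 | tee -a $HD_LOG
     # $? now reflects tee's status, so check the wrapper's exit code instead
     if [ ${PIPESTATUS[0]} -ne 0 ]
     then
         exit
     fi
}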
This is documented everywhere, and still, I always find myself messing around with this. So once and for all:
The files
In any user’s .ssh/ directory, there should be (among others) two files: id_rsa and id_rsa.pub. Or maybe with dsa instead of rsa. Doesn’t matter too much. These are the keys that are used when you try to login from this account to another host.
id_rsa is the secret key, and id_rsa.pub is public. The former should be readable only by the user (and root), and the latter by anyone. If anyone has the secret key, he or she may log in to whatever host identifies you with it.
If these files aren’t in .ssh/, they can be generated with ssh-keygen. This should be done once for each new shell account you generate, and maybe even once in a lifetime: It’s much more convenient to copy these files from your old user account, or you’ll have to re-establish the automatic login on each remote server with the new key.
So it goes:
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/eli/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/eli/.ssh/id_rsa.
Your public key has been saved in /home/eli/.ssh/id_rsa.pub.
The key fingerprint is:
77:7c:bf:4d:3b:a9:8a:e7:56:09:24:03:6f:22:d7:ca eli@myhost.localdomain
The key's randomart image is:
+--[ RSA 2048]----+
| .. |
| oo . |
| . o ++ |
| + + o |
| ES . + o |
| . . + . |
| . +|
| .o ++|
| .+o...oo|
+-----------------+
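As for the copy-from-an-old-account alternative mentioned above, something like this does it (the old host’s name is made up, and this is just a sketch):

$ scp -p oldhost:.ssh/id_rsa oldhost:.ssh/id_rsa.pub ~/.ssh/
$ chmod 600 ~/.ssh/id_rsa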
The public key file (id_rsa.pub) looks something like this:
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAs/ggsf1ZXbvyqQ7NbzIT+UDnGqo1LOgV3PpEUpVt8lw44jDgDCNGXXMZepMVwp3LgcGPKrrZ4n7b9/5zgXVrH86HZVyi+guu0IWLsYA4K+OgQY0m6rmXss/v7lt6ItIZTTJWhgTr4E8DE8+9PibYfBrvdITxdVAVl+FxmDEHhunnMzeqUsTMD7hniEWvlvHE0aE6Gp2rQPMU5sx3+LEGJ4y1BDzChrNa6dc2L7GP1ViGaP9SZBYVFPqbdkdCOOoR6N+FU/VHYIBeK5RdkTkfxGHKHfec1p8sXzveDHT69ouDaw0+c+3j2KlNq4ugnbTGKWrJaQBxQBEzvLgTdePCtQ== eli@myhost.localdomain
Note the eli@myhost.localdomain part at the end. It has no significance crypto-wise. It’s considered a comment. More about it below.
The private (secret) file, id_rsa, looks something like this (I don’t really use this one, of course. Never publish your private key!):
-----BEGIN RSA PRIVATE KEY-----
MIIEoQIBAAKCAQEAs/ggsf1ZXbvyqQ7NbzIT+UDnGqo1LOgV3PpEUpVt8lw44jDg
DCNGXXMZepMVwp3LgcGPKrrZ4n7b9/5zgXVrH86HZVyi+guu0IWLsYA4K+OgQY0m
6rmXss/v7lt6ItIZTTJWhgTr4E8DE8+9PibYfBrvdITxdVAVl+FxmDEHhunnMzeq
UsTMD7hniEWvlvHE0aE6Gp2rQPMU5sx3+LEGJ4y1BDzChrNa6dc2L7GP1ViGaP9S
ZBYVFPqbdkdCOOoR6N+FU/VHYIBeK5RdkTkfxGHKHfec1p8sXzveDHT69ouDaw0+
c+3j2KlNq4ugnbTGKWrJaQBxQBEzvLgTdePCtQIBIwKCAQAFJFi0oNaq6BzgQkBi
Q0JmNQ3q0am/dFhlZjx3Y1rpqt0NxuHUdglTIIuzC4RHY5gZpnHOBVausyrbMyff
IJypIyhwnD8rtzDhYua77bh2SFUJL+srRyGXZQba7Ku32h365C5bmb2YsczjTxQJ
Fw1/44MvNv+VozPRI7LJ1YPfSHanPoc77ZvKC/5hsXBgBioIaacO63HNaUeSgIwg
WTNo3zjRBGHPsDmNIR0rMT1STlpMQ/2kJ4BzV0HKKc0F6rDazIDTKTXBiziRSKfM
ftbayNu0iqCcGJWLvMlTNYB36VXBrb3NcKiFfsx99xIKvtG/UV/Slh7wz/ol2PnP
KTmrAoGBAOYpirjibbF2kP2zD6jJi/g6BiKl2xPumzFCEurqgLRWdT5Ns3hbS+F1
c/WhZyCRuYK/ZlQTo7D+FCE9Vft5nsSnZLpOu9kJ2pW4LuAfpNVQCvAcjtRWmMcX
dl0pH68/rdfC/oO3oMcUY8tZrJ/4NOD6dUyXZ+Ahjr5lEznFQWhNAoGBAMgsIHQ+
2s35g6J586msjg1xKUBqkgg88xqdJmSh/kp6krIi7+rGT5so3EOmjw0C6Ks8TVDf
C9RR+HuVOj7wNR9XhS4mlxTgnQyWdox8POqK4NBSdNMoqfMs9fqDBLtR9vItTcel
5hKD740ZF4ktaTgG1WMHElYyE0Iq+rJd/3gJAoGAdl6B220ieIYerlwWrpOJ0B3X
RQTXEZCnlazzyUVmwyUmWo5cTIa5T2D5zsgJJrFYF1seruWHYlbIhh+LTiFKVoH5
Sd9ZSwxhyVdokIVNdQSX6TNCJA9HQdGNVHuMo0VSFzEVLcwmzMioWfOam2m0y3l+
J2PPBY2Z3kKcLFbRLlMCgYEAvLvkFdTc7hckV10KT4Vv/gub7Ec5OvejYjxmBxxk
yeFIfBJP6/zOtt1h9qRa/aOoLGwOYjFi7MJQrwkLCCRPWBCwxR0SGv+qBI3dfSSu
dr104azUjJQN8+iQJrYLxo8cCOji73CId9t7dmgdgVazqdqOrdN3sFsZeOax21/w
3uMCgYBa0ZqQiFgL/sYUYysgqCF6N+aL/Nr19tdp/025feZgwG/9Q1196YTUiADn
jQzU3vpFpTpMnvTybE/+Zq3nGPXthOnsUBRK0/Lc5I8Ofgc9s9T0YrLwio6FGTAm
Hj0oC0CwrDMtSPtm7HOG+wpA4qxO6gf3OkgGzfZccyZjB2NiDQ==
-----END RSA PRIVATE KEY-----
How it works
The gory details left aside, the authentication goes like this: When you attempt to log in, your ssh client checks your .ssh/ directory for the key files. If it finds such, it notifies the server that it wants to try these key files, and sends information on the public keys it has.
The server on the remote host looks in the user’s home directory for a .ssh/authorized_keys file. If such a file exists, it should contain a line that is identical to the id_rsa.pub file on the client’s side. If such a match is found, the server uses the public key to create a challenge for the client. The client, which has the secret key, passes this challenge, and the authentication is done.
So .ssh/authorized_keys is just a concatenation of id_rsa.pub files, each line for a different key.
Now to the eli@myhost.localdomain part I mentioned above. It goes into the .ssh/authorized_keys file as well. It’s there to help people who have several authentication keys for logging in from different computers keep track of which line in .ssh/authorized_keys belongs to which. Just in case they want to delete a line or so.
Important: The home directory on the remote host must not be writable by anyone other than the user (and root, of course), or ssh will ignore authorized_keys. In other words, the home directory’s permission can be 0755 for example (viewable by all, but writable by the user only) or more restrictive, but if it’s 0775 or 0777, password-less login will not work. Rationale: if someone else can rename and replace your .ssh directory, that someone else can log in to your account.
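For reference, tightening the permissions on the remote host boils down to something like this (a sketch; .ssh/ itself and authorized_keys are usually kept restrictive as well):

$ chmod 755 ~
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys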
Making the remote host recognize you
There’s a command-line utility for this, namely ssh-copy-id. It merely uses SSH to log into the remote host (this time a password will be required, or why are you doing this?). All it does is to append id_rsa.pub to .ssh/authorized_keys on the remote host. That is, in fact, all that is required.
Alternatively, manually copy the line from id_rsa.pub into .ssh/authorized_keys.
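Either way, it boils down to something like this (host and user names are made up, and the manual variant assumes .ssh/ already exists on the remote host):

$ ssh-copy-id eli@remotehost.org

or, manually:

$ cat ~/.ssh/id_rsa.pub | ssh eli@remotehost.org 'cat >> ~/.ssh/authorized_keys'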
Remember that there is no problem disclosing id_rsa.pub to anyone. It’s really public. It’s the secret file you need to keep to yourself. And it’s quite easy to tell the difference between the two.
Having multiple SSH keys
It’s sometimes required to maintain multiple SSH keys. For example, in order to access Github as multiple users.
First, create a new SSH key pair:
$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/eli/.ssh/id_rsa): id_rsa_github2
Note that the utility allows choosing a different file name. This is how a new key pair lives side-by-side with the existing one.
The next step is to create (or edit) .ssh/config. This file should have permission mode 0600 (accessible only to user) because it’s sensitive by its nature, but also because ssh may ignore it otherwise. See “man ssh_config”.
Now let’s follow the scenario of multiple keys on Github. Say that .ssh/config reads as follows:
# Github access as second user
Host github-amigo
HostName github.com
User git
IdentityFile ~/.ssh/id_rsa_github2
If no entry in the config file matches, ssh uses the default settings. So existing ssh connections remain unaffected. In other words, this impacts only the host name that we’ve just invented. No need to state the default behavior explicitly. No collateral damage.
It’s of course possible to add several entries as shown above.
The setting above means that ssh now recognizes “github-amigo” as a legit name of a host. If that name is used, ssh will connect to github.com, identify itself as “git”, and use the said key.
It’s hence perfectly reasonable to test the connection to github.com with something like:
$ ssh github-amigo
PTY allocation request failed on channel 0
Hi amigouser! You've successfully authenticated, but GitHub does not provide shell access.
Connection to github.com closed.
The line in .git/config is accordingly
[remote "github"]
url = github-amigo:amigouser/therepository.git
fetch = +refs/heads/*:refs/remotes/github/*
In the url, the part before the colon is the name of the host. There is no need to state the user’s name, because ssh fills it in anyhow. After the colon, it’s the name of the repository.
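For a fresh setup, the same remote can be added from the command line (names as in the made-up example above):

$ git remote add github github-amigo:amigouser/therepository.git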
A successful session
If it doesn’t work, the -v flag can be used to get debug info on an ssh session. This is what it looks like when it’s OK. YMMV.
$ ssh -v remotehost.org
OpenSSH_5.3p1, OpenSSL 1.0.0b-fips 16 Nov 2010
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Connecting to remotehost.org [84.200.84.244] port 22.
debug1: Connection established.
debug1: identity file /home/eli/.ssh/identity type -1
debug1: identity file /home/eli/.ssh/id_rsa type 1
debug1: identity file /home/eli/.ssh/id_dsa type -1
debug1: Remote protocol version 2.0, remote software version OpenSSH_5.3
debug1: match: OpenSSH_5.3 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_5.3
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-ctr hmac-md5 none
debug1: kex: client->server aes128-ctr hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: checking without port identifier
debug1: Host 'remotehost.org' is known and matches the RSA host key.
debug1: Found key in /home/eli/.ssh/known_hosts:50
debug1: found matching key w/out port
debug1: ssh_rsa_verify: signature correct
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
debug1: Next authentication method: gssapi-with-mic
debug1: Unspecified GSS failure. Minor code may provide more information
Credentials cache file '/tmp/krb5cc_1010' not found
debug1: Unspecified GSS failure. Minor code may provide more information
Credentials cache file '/tmp/krb5cc_1010' not found
debug1: Unspecified GSS failure. Minor code may provide more information
debug1: Next authentication method: publickey
debug1: Offering public key: /home/eli/.ssh/id_rsa
debug1: Server accepts key: pkalg ssh-rsa blen 277
debug1: Authentication succeeded (publickey).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions@openssh.com
debug1: Entering interactive session.
debug1: Sending environment.
debug1: Sending env XMODIFIERS = @im=none
debug1: Sending env LANG = en_US.UTF-8
And shell prompt comes next.
… and then it didn’t work
Fast forward to December 2024: with an OpenSSH_9.2p1 client (Debian-2+deb12u3, OpenSSL 3.0.15 3 Sep 2024), I failed to log in password-less into a really old ssh server:
$ ssh -v theserver.com
[ ... ]
debug1: Authentications that can continue: publickey,password
debug1: Next authentication method: publickey
debug1: Offering public key: /home/theuser/.ssh/id_rsa RSA SHA256:LKkjdfrgkGfdjgKJ35AfNKJYL+fIQ
debug1: send_pubkey_test: no mutual signature algorithm
debug1: Trying private key: /home/theuser/.ssh/id_ecdsa
debug1: Trying private key: /home/theuser/.ssh/id_ecdsa_sk
debug1: Trying private key: /home/theuser/.ssh/id_ed25519
debug1: Trying private key: /home/theuser/.ssh/id_ed25519_sk
debug1: Trying private key: /home/theuser/.ssh/id_xmss
debug1: Trying private key: /home/theuser/.ssh/id_dsa
debug1: Next authentication method: password
[ ... ]
Following this Q&A (the same person asking and answering), I tried:
$ ssh -v -o "PubkeyAcceptedAlgorithms=+ssh-rsa" theserver.com
And that worked, surprisingly enough!
[ ... ]
debug1: Authentications that can continue: publickey,password
debug1: Next authentication method: publickey
debug1: Offering public key: /home/theuser/.ssh/id_rsa RSA SHA256:LKkjdfrgkGfdjgKJ35AfNKJYL+fIQ
debug1: Server accepts key: /home/theuser/.ssh/id_rsa RSA SHA256:LKkjdfrgkGfdjgKJ35AfNKJYL+fIQ
Authenticated to theserver.com ([193.12.56.12]:22) using "publickey".
debug1: channel 0: new session [client-session] (inactive timeout: 0)
debug1: Requesting no-more-sessions@openssh.com
[ ... ]
Possibly add this line to /etc/ssh/ssh_config (or to your local ~/.ssh/config):
PubkeyAcceptedAlgorithms +ssh-rsa
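If a global change feels too broad, the same option can be scoped to the problematic server only, with a Host block like the one shown earlier (same placeholder host name as above):

Host theserver.com
    PubkeyAcceptedAlgorithms +ssh-rsa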
Why did it help? I have no idea. Even before the change in the config file, the query on the said variable listed ssh-rsa as one of the candidates:
$ ssh -Q PubkeyAcceptedAlgorithms
So it’s not clear what happened here.
Intro
This is a minimal HOWTO on installing Linux Malware Detect for occasional use as a regular non-root user. Not that I’m so sure it’s worth bothering, given that contemporary exploit code seems to be able to go under its radar.
Background
One not-so-bright afternoon, I got a sudden mail from my web hosting provider saying that my account had been shut down immediately due to malware detected in my files (the quote below is slightly censored):
Hello,
Our routine malware scanner has reported files on your account as malicious. Pasted below is the report for your confirmation. Your account hosts old, outdated and insecure scripts which needs to be updated asap. Please reply back to this email so that we can work this out.
====================================
HOST: ——-
SCAN ID: 151230-0408.31792
STARTED: Dec 30 2015 04:08:40 -0500
TOTAL HITS: 1
TOTAL CLEANED: 0
FILE HIT LIST:
{HEX}php.base64.v23au.185 : /home/——/public_html/modules/toolbar/javascript21.php => /usr/local/maldetect/quarantine/javascript21.php.295615562
===============================================
I was lucky enough to have a backup of my entire hosted subdirectory, so I made a new backup, ran
$ find . -type f | while read i ; do sha1sum "$i" ; done > ../now-sha1.txt
on both the good and the bad copies, and then compared the output files. This required some manual cleanup of several new PHP files, which contained all kinds of weird stuff.
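Comparing the two lists can be done along these lines (a sketch; it assumes the older backup’s list was saved as ../old-sha1.txt, and sorts by file name first, since find’s ordering isn’t stable):

$ sort -k2 ../old-sha1.txt > /tmp/old-sorted.txt
$ sort -k2 ../now-sha1.txt > /tmp/now-sorted.txt
$ diff /tmp/old-sorted.txt /tmp/now-sorted.txt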
In hindsight, it seems like the malware PHP files were created during an SQL injection attack on Drupal 7 back in October 2014 (read again: an SQL injection attack in 2014. It’s as if a malaria outbreak occurred in Europe today). The web host did patch the relevant file for me (without me knowing about it, actually), but only a couple of days after the attack broke loose. Then the files remained undetected for about a year, after which only one of them was nailed down. The malware PHP code is clearly crafted to be random, so it works around pattern detection.
Now that we’re convinced that Linux Malware Detect doesn’t actually find malware, let’s install it.
Installing
There are plenty of guides on the web. Here’s my own take.
$ git clone https://github.com/rfxn/linux-malware-detect.git
For those curious about which revision I’m using:
$ git rev-parse HEAD
190f56e8704213fab233a5ac62820aea02a055b2
Change directory to linux-malware-detect/, and as root:
# ./install.sh
Linux Malware Detect v1.5
(C) 2002-2015, R-fx Networks <proj@r-fx.org>
(C) 2015, Ryan MacDonald <ryan@r-fx.org>
This program may be freely redistributed under the terms of the GNU GPL
installation completed to /usr/local/maldetect
config file: /usr/local/maldetect/conf.maldet
exec file: /usr/local/maldetect/maldet
exec link: /usr/local/sbin/maldet
exec link: /usr/local/sbin/lmd
cron.daily: /etc/cron.daily/maldet
maldet(15488): {sigup} performing signature update check...
maldet(15488): {sigup} could not determine signature version
maldet(15488): {sigup} signature files missing or corrupted, forcing update...
maldet(15488): {sigup} new signature set (2015121610247) available
maldet(15488): {sigup} downloading http://cdn.rfxn.com/downloads/maldet-sigpack.tgz
maldet(15488): {sigup} downloading http://cdn.rfxn.com/downloads/maldet-cleanv2.tgz
maldet(15488): {sigup} verified md5sum of maldet-sigpack.tgz
maldet(15488): {sigup} unpacked and installed maldet-sigpack.tgz
maldet(15488): {sigup} verified md5sum of maldet-clean.tgz
maldet(15488): {sigup} unpacked and installed maldet-clean.tgz
maldet(15488): {sigup} signature set update completed
maldet(15488): {sigup} 10822 signatures (8908 MD5 / 1914 HEX / 0 USER)
Reduce installation
Remove the cron jobs. First, /etc/cron.d/maldet_pub:
*/10 * * * * root /usr/local/maldetect/maldet --mkpubpaths >> /dev/null 2>&1
and also /etc/cron.daily/maldet (which scans through everything daily, I suppose):
#!/usr/bin/env bash
export PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:$PATH
export LMDCRON=1
. /usr/local/maldetect/conf.maldet
if [ -f "/usr/local/maldetect/conf.maldet.cron" ]; then
. /usr/local/maldetect/conf.maldet.cron
fi
find=`which find 2> /dev/null`
if [ "$find" ]; then
# prune any quarantine/session/tmp data older than 7 days
tmpdirs="/usr/local/maldetect/tmp /usr/local/maldetect/sess /usr/local/maldetect/quarantine /usr/local/maldetect/pub"
for dir in $tmpdirs; do
if [ -d "$dir" ]; then
$find $dir -type f -mtime +7 -print0 | xargs -0 rm -f >> /dev/null 2>&1
fi
done
fi
if [ "$autoupdate_version" == "1" ] || [ "$autoupdate_signatures" == "1" ]; then
# sleep for random 1-999s interval to better distribute upstream load
sleep $(echo $RANDOM | cut -c1-3) >> /dev/null 2>&1
fi
if [ "$autoupdate_version" == "1" ]; then
# check for new release version
/usr/local/maldetect/maldet -d >> /dev/null 2>&1
fi
if [ "$autoupdate_signatures" == "1" ]; then
# check for new definition set
/usr/local/maldetect/maldet -u >> /dev/null 2>&1
fi
# if we're running inotify monitoring, send daily hit summary
if [ "$(ps -A --user root -o "cmd" | grep maldetect | grep inotifywait)" ]; then
/usr/local/maldetect/maldet --monitor-report >> /dev/null 2>&1
else
if [ -d "/home/virtual" ] && [ -d "/usr/lib/opcenter" ]; then
# ensim
/usr/local/maldetect/maldet -b -r /home/virtual/?/fst/var/www/html/,/home/virtual/?/fst/home/?/public_html/ 1 >> /dev/null 2>&1
elif [ -d "/etc/psa" ] && [ -d "/var/lib/psa" ]; then
# psa
/usr/local/maldetect/maldet -b -r /var/www/vhosts/?/ 1 >> /dev/null 2>&1
elif [ -d "/usr/local/directadmin" ]; then
# DirectAdmin
/usr/local/maldetect/maldet -b -r /home?/?/domains/?/public_html/,/var/www/html/?/ 1 >> /dev/null 2>&1
elif [ -d "/var/www/clients" ]; then
# ISPConfig
/usr/local/maldetect/maldet -b -r /var/www/clients/?/web?/web 1 >> /dev/null 2>&1
elif [ -d "/etc/webmin/virtual-server" ]; then
# Virtualmin
/usr/local/maldetect/maldet -b -r /home/?/public_html/,/home/?/domains/?/public_html/ 1 >> /dev/null 2>&1
elif [ -d "/usr/local/ispmgr" ]; then
# ISPmanager
/usr/local/maldetect/maldet -b -r /var/www/?/data/,/home/?/data/ 1 >> /dev/null 2>&1
elif [ -d "/var/customers/webs" ]; then
# froxlor
/usr/local/maldetect/maldet -b -r /var/customers/webs/ 1 >> /dev/null 2>&1
else
# cpanel, interworx and other standard home/user/public_html setups
/usr/local/maldetect/maldet -b -r /home?/?/public_html/,/var/www/html/,/usr/local/apache/htdocs/ 1 >> /dev/null 2>&1
fi
fi
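Removing both is then just a matter of (paths as above):

# rm /etc/cron.d/maldet_pub /etc/cron.daily/maldet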
And then remove the bootup hooks (I could and should have done this with chkconfig, actually):
# rm `find /etc/rc.d/ -iname S\*maldet\*`
rm: remove symbolic link `/etc/rc.d/rc3.d/S70maldet'? y
rm: remove symbolic link `/etc/rc.d/rc4.d/S70maldet'? y
rm: remove symbolic link `/etc/rc.d/rc2.d/S70maldet'? y
rm: remove symbolic link `/etc/rc.d/rc5.d/S70maldet'? y
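For the record, the chkconfig way would presumably have been something along these lines (I didn’t actually run this):

# chkconfig maldet off
# chkconfig --del maldet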
Configuration
Edit /usr/local/maldetect/conf.maldet. The file is self-explanatory. The defaults are quite non-intrusive (no quarantine or cleaning by default, no user suspension etc.). I turned off the automatic updates (I don’t run this as a cron job anyhow) and opted in to scans by regular users:
scan_user_access="1"
Other than that, I kept it as is.
Preparing to run as a non-root user
As a regular user (“eli”) I went
$ maldet
touch: cannot touch `/usr/local/maldetect/pub/eli/event_log': No such file or directory
/usr/local/maldetect/internals/functions: line 31: cd: /usr/local/maldetect/pub/eli/tmp: No such file or directory
mkdir: cannot create directory `/usr/local/maldetect/pub/eli': Permission denied
chmod: cannot access `/usr/local/maldetect/pub/eli/tmp': No such file or directory
mkdir: cannot create directory `/usr/local/maldetect/pub/eli': Permission denied
chmod: cannot access `/usr/local/maldetect/pub/eli/sess': No such file or directory
mkdir: cannot create directory `/usr/local/maldetect/pub/eli': Permission denied
chmod: cannot access `/usr/local/maldetect/pub/eli/quar': No such file or directory
sed: couldn't open temporary file /usr/local/maldetect/sedIuE2ll: Permission denied
[...]
So it expects a directory accessible by the non-root user. Let’s make one (as root):
# cd /usr/local/maldetect/pub/
# mkdir eli
# chown eli:eli eli
Giving it a try
Try
$ maldet -h
And performing a scan (checking a specific sub-directory on my Desktop):
$ maldet -a /home/eli/Desktop/hacked/
sed: couldn't open temporary file /usr/local/maldetect/sedcSyxa1: Permission denied
Linux Malware Detect v1.5
(C) 2002-2015, R-fx Networks <proj@rfxn.com>
(C) 2015, Ryan MacDonald <ryan@rfxn.com>
This program may be freely redistributed under the terms of the GNU GPL v2
ln: creating symbolic link `/usr/local/maldetect/sigs/lmd.user.ndb': Permission denied
ln: creating symbolic link `/usr/local/maldetect/sigs/lmd.user.hdb': Permission denied
/usr/local/maldetect/internals/functions: line 1647: /usr/local/maldetect/tmp/.runtime.hexsigs.18117: Permission denied
maldet(18117): {scan} signatures loaded: 10822 (8908 MD5 / 1914 HEX / 0 USER)
maldet(18117): {scan} building file list for /home/eli/Desktop/hacked/, this might take awhile...
maldet(18117): {scan} setting nice scheduler priorities for all operations: cpunice 19 , ionice 6
maldet(18117): {scan} file list completed in 0s, found 8843 files...
maldet(18117): {scan} scan of /home/eli/Desktop/hacked/ (8843 files) in progress...
maldet(18117): {scan} 8843/8843 files scanned: 0 hits 0 cleaned
maldet(18117): {scan} scan completed on /home/eli/Desktop/hacked/: files 8843, malware hits 0, cleaned hits 0, time 253s
maldet(18117): {scan} scan report saved, to view run: maldet --report 151231-0915.18117
Uh, that was really bad. The directory contains several malware PHP files. Maybe the signatures aren’t up to date? The file my hosting provider detected had already been quarantined, and those that were left are probably sophisticated enough to stay under the radar.
Update the signature file
Since I turned off the automatic update of signature files, I have to do this manually. As root,
# maldet -u
Linux Malware Detect v1.5
(C) 2002-2015, R-fx Networks <proj@rfxn.com>
(C) 2015, Ryan MacDonald <ryan@rfxn.com>
This program may be freely redistributed under the terms of the GNU GPL v2
maldet(15175): {sigup} performing signature update check...
maldet(15175): {sigup} local signature set is version 2015121610247
maldet(15175): {sigup} latest signature set already installed
Well, no wonder, I just installed maldet.
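By the way, updating maldet itself, as opposed to its signatures, is a separate flag, if I’m not mistaken:
# maldet -d
Not that it would have mattered here, for the same reason.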
So the bottom line, mentioned above, is that this tool isn’t all that effective against the specific malware I got.
USB 3.0 is steadily becoming more common, and it’s a quiet revolution. These innocent-looking blue connectors don’t tell the little secret: They carry 4 new data pins (SSTX+, SSTX-, SSRX+, SSRX-), which will replace the existing D+/D- communication pins one day. Simply put, USB 3.0 is completely standalone; it doesn’t really need those D+/D- pins to establish a connection. Today’s devices and cables carry both USB 2.0 and USB 3.0 in parallel, but that’s probably a temporary situation.
This means that USB 3.0 is the line where backward compatibility will be cut away. Unlike many other hardware standards (the Intel PC in particular), which drag along legacy compatibility forever, USB will probably leave USB 2.0 behind, sooner or later.
It took me quite some effort to nail this down, but the USB 3.0 specification makes it quite clear in section 3.2.6.2 (“Hubs”):
In order to support the dual-bus architecture of USB 3.0, a USB 3.0 hub is the logical combination of two hubs: a USB 2.0 hub and a SuperSpeed hub. (…) The USB 2.0 hub unit is connected to the USB 2.0 data lines and the SuperSpeed hub is connected to the SuperSpeed data lines. A USB 3.0 hub connects upstream as two devices; a SuperSpeed hub on the SuperSpeed bus and a USB 2.0 hub on the USB 2.0 bus.
In short: a USB 3.0 hub is two hubs: One for 3.0 and one for 2.0. They are unrelated. The buses are unrelated. This is demonstrated well in the following block diagram, taken from Cypress’ HX3 USB 3.0 Hub datasheet:

Even though any device is required to support both USB 2.0 and USB 3.0 in order to receive a USB 3.0 certification (USB 1.1 isn’t required, even though it’s allowed and common), USB 3.0 is self-contained. Hotplug detection is done by sensing a load on the SuperSpeed wires, and so is all other PHY functionality.
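A quick way to see this two-hubs arrangement on a Linux machine, by the way, is to plug a USB 3.0 hub into a USB 3.0 port and run
$ lsusb -t
Unless I’m mistaken, the same physical hub appears as two separate hub devices in the tree: one under the xHCI controller’s SuperSpeed root hub, and one under its USB 2.0 root hub.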
An important conclusion is that a USB 3.0 hub won’t help those trying to connect several bandwidth-demanding USB 2.0 devices to a single hub, hoping that the 5 Gb/s link with the computer will carry the aggregate of the 480 Mb/s bandwidth from each device. There will still be one single 480 Mb/s link carrying all USB 2.0 devices’ data.
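Just to put (hypothetical) numbers on it: two USB 2.0 devices that each push, say, 300 Mb/s of data can’t coexist behind the same USB 3.0 hub, because 2 × 300 Mb/s = 600 Mb/s exceeds the single 480 Mb/s USB 2.0 link they share (and the usable payload rate is well below 480 Mb/s anyhow).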
Having said all the above, there is a chance that the host may expect to talk with a physical device through both 2.0 and 3.0. For example, it may have some functionality connected to USB 2.0 only, and some to 3.0, through an internal hub. This doesn’t contradict the independence of the buses, but it may cause problems if SuperSpeed-only connections are made, as offered by Cypress’ Shared Link (TM) feature.
But the spec doesn’t encourage those USB 2.0/3.0 mixes, to say the least. Section 11.3 (“USB 3.0 Device Support for USB 2.0”) says:
For any given USB 3.0 peripheral device within a single physical package, only one USB connection mode, either SuperSpeed or a USB 2.0 speed but not both, shall be established for operation with the host.
And there’s also the less clear-cut sentence in section 11.1 (“USB 3.0 Host Support for USB 2.0”):
When a USB 3.0 hub is connected to a host’s USB 3.0-capable port, both USB 3.0 SuperSpeed and USB 2.0 high-speed bus connections shall be allowed to connect and operate in parallel. There is no requirement for a USB 3.0-capable host to support multiple parallel connections to peripheral devices.
The same is said about hubs in section 11.2 (“USB 3.0 Hub Support for USB 2.0”), and yet it’s not exactly clear to me what they mean by requiring that the parallel connections be allowed, but not requiring multiple parallel connections. Probably a distinction between allowing the physical layers to set up their links (mandatory) and the drivers actually using both links (not required).
So one day, it won’t be possible to connect a USB 3.0 device to an old USB 2.0 plug. Maybe that day is already here.
USB 3.0 over fiber?
These SuperSpeed wires are in fact driven by plain multi-gigabit transceivers (MGT, GTX), based upon the same PHY as Gigabit Ethernet, PCIe, SATA and several others (requiring equalization on the receiver side, by the way). So one could think about connecting these four wires to an SFP+ optical transceiver and obtaining a fiber link carrying USB? Sounds easy, and maybe it is, but at least these issues need to be considered:
- The USB 3.0 spec uses low-frequency signaling (10-50 MHz) to initiate a link with the other side. SFP+ transceivers usually don’t cover this range, at least not in their datasheets (it’s more like 300-2500 MHz or so). So this vital signal may not reach the other side properly, and hence the link establishment may fail.
- The transmitter is required to detect whether there’s a receiver’s load on the other side, by generating a common-mode voltage pulse and measuring the current. SFP+ transceivers may not be detected as loads this way, as they typically present only a differential load between the wires. This is quite easily fixed by adding a compatible signal repeater between the USB transmitter and the SFP+ signal input pair.
- The transmitter will detect a load even if the other side isn’t ready (e.g. there’s nothing connected to the SFP+ transceiver, or the hardware on the other side is off). I haven’t dug into the spec to check whether this is problematic, but in the worst case, the other side’s readiness can be signaled by turning the laser on from the other side. Or actually, by not turning it off. SFP+ transceivers have a “Disable Tx laser” input pin, as well as a “Receive signal loss” output, for this purpose.
- Without investigating this too much, it seems like this fiber connection will not be able to carry traffic for USB 2.0 devices by simple means. It’s not clear if a USB 2.0 to USB 3.0 converter is possible to implement in the same way that USB 1.1 traffic is carried over USB 2.0 by a multi-speed hub: As mentioned above, USB 2.0 is expected to be routed through separate USB 2.0 hubs. Odds are however that once computers with USB 3.0-only ports start to appear, so will dedicated USB bridges for people with old hardware, based upon some kind of tunneling technique.
It seems like hierarchies are to board designers what C++ is to programmers: It kills the boredom, but also the project. They will proudly show you their block diagrams and the oh-so-ordered structure, but at the end of the day, no one can really figure out what’s connected to what. Which is kinda important in a PCB design.
Not to mention that it’s impossible to tell what’s going on by looking at the pdf schematics: Search for a net’s name, and you’ll find it in 20 places, 18 of which are the inter-hierarchy connections of that net. One of them may be wrong, but it’s virtually impossible to spot.
On a good day, everything looks fine and in order, and no one notices small killers like the one below. It’s easy (?) to spot it now that I’ve put the focus on it, but would you really see it on page 23, in yet another sheet of block connections in the schematics?
This is from a real-life design made by a serious company:
So please, PCB designers, wherever you are: Look at any reference design you can find, and do the same: Just write out the net names that continue on another page. Don’t try to draw the connections between the blocks; they help nobody. If the net name is meaningful, we will all understand on which page to look for it. And if we don’t, we’ll use our pdf browser’s search feature. Really.
These are my jots as I resized a partition containing an encrypted LVM physical volume, and then took advantage of that extra space by extending a logical volume containing an ext4 file system. The system is an Ubuntu 14.04.1 with a 3.13.0-35-generic kernel.
There are several HOWTOs on this, but somehow I struggled a bit before I got it working. Since I’ll do this again sometime in the future (there’s still some space left on the physical volume) I wrote it down. I mainly followed some of the answers to this question.
The overall setting:
$ ls -lR /dev/mapper
/dev/mapper:
total 0
crw------- 1 root root 10, 236 Nov 17 16:35 control
lrwxrwxrwx 1 root root 7 Nov 17 16:35 cryptdisk -> ../dm-0
lrwxrwxrwx 1 root root 7 Nov 17 16:35 vg_main-lv_home -> ../dm-3
lrwxrwxrwx 1 root root 7 Nov 17 16:35 vg_main-lv_root -> ../dm-2
lrwxrwxrwx 1 root root 7 Nov 17 16:35 vg_main-lv_swap -> ../dm-1
$ ls -lR /dev/vg_main/
/dev/vg_main/:
total 0
lrwxrwxrwx 1 root root 7 Nov 17 16:35 lv_home -> ../dm-3
lrwxrwxrwx 1 root root 7 Nov 17 16:35 lv_root -> ../dm-2
lrwxrwxrwx 1 root root 7 Nov 17 16:35 lv_swap -> ../dm-1
And the LVM players after the operation described below:
lvm> pvs
PV VG Fmt Attr PSize PFree
/dev/mapper/cryptdisk vg_main lvm2 a-- 465.56g 121.56g
lvm> vgs
VG #PV #LV #SN Attr VSize VFree
vg_main 1 3 0 wz--n- 465.56g 121.56g
lvm> lvs
LV VG Attr LSize Pool Origin Data% Move Log Copy% Convert
lv_home vg_main -wi-ao--- 300.00g
lv_root vg_main -wi-ao--- 40.00g
lv_swap vg_main -wi-ao--- 4.00g
Invoke “lvm” in order to run LVM-related commands (probably not really required):
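In other words, as root:
# lvm
lvm>
(the same commands can presumably be run directly as pvresize, lvextend etc., hence the “not really required” comment).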
Make LVM detect that I’ve resized the underlying partition (mapped as cryptdisk):
lvm> pvresize -t /dev/mapper/cryptdisk
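The -t flag makes this a dry run. Assuming it reports that the physical volume would be resized successfully, the real thing is presumably the same command without -t (that part didn’t make it into my transcript):
lvm> pvresize /dev/mapper/cryptdisk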
Now to resizing the logical volume. Unfortunately, the Logical Volume Management GUI tool refused to do it, claiming that the volume is not mounted but in use (actually, I think it *was* mounted). So I went for the low-level way.
Under “Advanced Options” I went for a rescue boot, and chose a root shell.
Check the filesystem in question
fsck -f /dev/mapper/vg_main-lv_home
Back to the “lvm” shell. A little test first, note the -t flag (making lv_home, under vg_main, 200 GiB larger):
lvm> lvextend -t -L +200g /dev/vg_main/lv_home
It should write out the desired final size (e.g. 300 GiB)
Then for real:
lvm> lvextend -L +200g /dev/vg_main/lv_home
Oops, I got “Command failed with status code 5”. The reason was that the root filesystem was mounted read-only. After fixing that, I got “Logical volume successfully resized”.
But wait! There is no device file /dev/vg_main/lv_home at this point, which is why the /dev/mapper name is used below.
Now resize the ext4 filesystem
resize2fs /dev/mapper/vg_main-lv_home
And run a final check again:
fsck -f /dev/mapper/vg_main-lv_home
And rebooted the computer normally.
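Once the system is up again with lv_home mounted, something like the following should confirm that the filesystem indeed sees the extra 200 GiB (a sanity check I’d suggest, not part of my transcript):
$ df -h /home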