As I use Blender only occasionally, I’ve written down quite a few hints to myself for getting back to business. If this helps anyone else, so much the better.
I’ve also written two similar posts on this matter: A general post on Blender and a post on 3D printing.
Bones
- For a simple beginner’s use example, see this page.
- A bone is simply a handle on which one can do the Grab / Rotate / Scale trio. It has a pivot point and a handle. The manipulations on the bone apply to all vertices in the bone’s Vertex Group, relative to the bone’s pivot point, and in proportion to their weight for that group.
- The Vertex Groups are listed under the object’s properties, under Object Data (icon is an upside down triangle of dots). In Weight Paint mode, this is where the group to paint weights for is selected.
- The Vertex Groups’ names are taken from the bones’ when weights are assigned automatically.
- The Armature modifier is added (automatically) to the object subject to the bones. Be sure that it’s the first modifier (uppermost in the stack), in particular before Subdivision Surface. It’s the original mesh we want to move, not tear pieces of the rounded one. Corollary: The bones’ deformations can be applied, like any modifier.
- Always check the bones’ motion alignment with the parent bone, and set the bones’ Roll parameter (in the bones’ properties, icon with bone) if necessary (in particular if the previous segment has been resized). This sets the axis in space around which the bone rotates, and has to be done manually in Edit mode. It controls the direction the bone rotates with respect to its origin, which is crucial for intuitive motion: with a wrong Roll, the bones seem to move right, but just a little off the desired direction. Just align the square of the bone symbol with the previous segment’s direction.
- The automatic weights aren’t all that good. In the end, there’s no way out but to assign the weights manually.
- And the Weight painting is good for getting a picture of what’s going on. But assigning weights with it is really bad. In particular as it’s easy to mistakenly paint a completely unrelated vertex, leading to weird things happening.
- Instead, set the weight manually under the Object Data tab (just mentioned). Select the vertices in Edit Mode, write the desired weight in the dedicated place under the Object Data tab, and click “Assign”.
- The Armature must be a parent of the object to be distorted. Extruded bones are children of the bones they’re extruded from.
- To move around the bones (in particular to rotate them), enter Pose mode (or just click “Pose” for the relevant armature in the Object Outliner).
- Zero the pose: Change to Pose mode, select all (A) and Pose > Clear Transform > All
- The bones’ influence is disabled only in Edit mode (unless enabled in the Armature modifier).
- When an object controlled by bones is duplicated, the vertex groups are duplicated as well, but not the bones. So both objects are controlled by the same bones, in an unnatural way (center of rotation on previous bones etc.)
- If a vertex belongs only to one group, the weight is meaningless: If it belongs to the group, it will move 100% anyhow.
- If a vertex belongs to more than one vertex group, the weights are normalized so that they total 1.0. So it’s fine to have an overlap on the joints, but be careful with pushing it too far. Note that the bone after the joint is moved by virtue of parenting, so there’s no reason to assign weights after the joint. Doing so only weakens the effect that is supposed to move that part.
- Note that vertex groups that have no bone don’t count for proportional motion of the vertex. For a vertex that moves less than 100% on a single bone, also assign a second vertex group that belongs to a bone that will never move. This is good for transition with a fixed part.
- Rotate bones with Individual Origins pivot point.
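The weight rules above (a single group means full motion, several groups get normalized, groups without a bone don't count) can be sketched in plain Python. This is only my illustration of the arithmetic as described in these notes, not Blender's actual code. The function name, the group names and the dictionary layout are all made up for the example.

```python
def effective_weights(vertex_weights, bone_names):
    """Return the share of each bone's motion that one vertex receives.

    vertex_weights: dict of vertex group name -> assigned weight
    bone_names: set of group names that actually have a bone
    (both are hypothetical structures for this sketch)
    """
    # Groups without a matching bone take no part in the normalization
    relevant = {g: w for g, w in vertex_weights.items()
                if g in bone_names and w > 0.0}
    total = sum(relevant.values())
    if total == 0.0:
        return {}  # No bone moves this vertex at all
    return {g: w / total for g, w in relevant.items()}


bones = {"upper_arm", "forearm", "fixed"}

# A single group: the weight value is meaningless, the vertex moves 100%
print(effective_weights({"forearm": 0.3}, bones))

# Overlap on a joint: the two weights are scaled so they total 1.0
print(effective_weights({"upper_arm": 0.2, "forearm": 0.6}, bones))

# A bone that never moves ("fixed") makes the vertex follow only 25%
print(effective_weights({"forearm": 0.25, "fixed": 0.75}, bones))
```

The last case is the trick for a smooth transition to a fixed part: the never-moving bone soaks up the rest of the total.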
Textures etc.
- Each face is related to a material. The first material is assigned to all faces. Additional ones need to be assigned.
- Once a material is selected in the Material property button-tab, the Texture tab relates to it.
- Projecting an image: First mark a seam in Edit mode. Select a set of edges and Mesh > Edges > Mark Seam. Then select the faces to project (possibly all) and go Mesh > UV Unwrap… > Unwrap (or possibly Project from view or some other choice).
- When using UV projection, the Type is “Image or Movie”, the Source is the file, and under “Mapping” it says Coordinates: UV (otherwise the mapping in Material view will be wrong).
- UV/Image Editor: Maps pieces of the image into faces. Use side-by-side with a 3D view in Edit mode. Enable “Keep UV and Edit mode mesh selection in sync” for easy selection (somewhere in the middle of the bottom bar). The mouse’s middle button + move mouse moves the image view (instead of Shift-scroll or something)
- Multiple images can be sources for a single object, by virtue of generating multiple materials, and assigning them to different faces. Each material is then linked to separate textures, each based upon a different image.
- Texture paint: A little GIMP, just in 3D. The changes are updated in the source image(s). The big upper box is the brush selector. Most notable is “Clone”, which works like GIMP’s, with CTRL-click to select the source. Excellent for hiding seams.
- Careful with overlapping UV mappings on a single image with Texture Paint: One stroke will affect all mapped regions.
- Texture paint may manipulate several images in a single stroke, if this stroke covers regions sourced from different images.
- If texture paint is responding slowly and eating a lot of CPU, try reducing the number of Subdivision Surface subdivisions, if used. Too many faces aren’t good.
- Don’t forget to save the 2D images in the end!
- For copying a 3D shape from a 2D image, use Global Mapping on the texture, along with a Top Orthographic view. The texture remains in place no matter how the object is twisted and turned, so it’s fairly easy to drag it along the image’s edges.
Rendering
- F12: Render Image (“Quick Render”). Also from top menu Render > Render Image. Return to 3D view with F11.
- Shading Smooth / Flat at the Tool shelf doesn’t change the shape, but only the way light is reflected
- If the rendering result suffers from weird shadows, and/or unexplained edge lines on a surface that’s supposed to be smooth, try Mesh > Normals > Recalculate Outside in Edit Mode, which may fix normals that have been messed up by edits.
Cycles: How it works
If a realistic rendering result is desired, forget about Blender’s native render engine. It’s a losing battle of dirty tricks against the obvious way to reach a natural appearance, which is to simulate the light rays. And that is what the Cycles engine does.
This is a very simplistic description of Cycles. In reality, it’s by far more clever and efficient, so the results on the real engine are better than you would expect from the description below.
For each sample (i.e. an iteration of improving the rendered image), and for each pixel to be generated on the rendered image, the render engine traces the light ray, backwards. That is, from the camera to the source of light.
The initial leg is simple, as the angle of view is known and deterministic. If this ray hits nothing, we get black. If it hits a face, it examines its material data. By hitting something, I mean the first intersection between the ray’s line and some face in the mesh.
When hitting a face, the face’s material’s shader is activated. If it’s a pure emission of light, that’s the final station, and the pixel’s value can be calculated. If it’s any other shader, it will tell the render engine at what angle to continue, and how to modify the light from the source, once reached. This modification is the material’s color or the texture at the specific point that was hit.
And so it goes on, until a ray hitting nothing is reached, or a pure emitting light source. Once the final station has been reached, the aggregation of color modifications is applied, and there’s the final pixel value.
So why is randomness involved?
A diffusing surface collects light from all directions, and reflects it towards the camera. Since the light tracing can only follow one direction, it’s picked at random by the shader. So each sample consists of one such ray trace for each pixel. Each time a diffusing surface is reached, there’s a lottery. Hence the randomness. Except for pass-through and purely reflective shaders (i.e. Glossy with Roughness 0), which have a deterministic ray bending pattern.
When the “Mix” shader is used, the mix rate is a real mix: Each shader gets its go, and the result is mixed. Try to mix an emission shader with a black diffusion.
So God may not play dice, but Cycles surely does.
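To make the lottery concrete, here is a toy version of the process for a single pixel, in plain Python. It's a sketch under heavy assumptions and has nothing to do with how Cycles is really implemented: there is no geometry at all, and the chances of a random bounce reaching the light, reaching the diffuse surface again or reaching nothing are simply made-up constants (as are all the names).

```python
import random

ALBEDO = 0.8       # Color modification applied by the (gray) diffuse surface
EMISSION = 1.0     # Strength of the emission plane
P_LIGHT = 0.25     # Assumed chance that a random bounce reaches the light
P_SURFACE = 0.25   # Assumed chance that it reaches the diffuse surface again
MAX_BOUNCES = 8    # Give up (black) after this many diffuse hits


def trace_sample(rng):
    """Trace one ray backwards from the camera and return its pixel value."""
    throughput = 1.0  # Aggregated color modifications collected so far
    # The first leg is deterministic: the camera ray hits the diffuse surface
    for _ in range(MAX_BOUNCES):
        throughput *= ALBEDO
        r = rng.random()  # The lottery: where does the ray go from here?
        if r < P_LIGHT:
            return throughput * EMISSION  # Final station: an emitter
        if r < P_LIGHT + P_SURFACE:
            continue  # Hit the diffuse surface once more
        return 0.0  # Hit nothing: black
    return 0.0


def render_pixel(samples, seed=0):
    """Average many samples; the noise shrinks as samples are added."""
    rng = random.Random(seed)
    return sum(trace_sample(rng) for _ in range(samples)) / samples


for n in (10, 100, 10000):
    print(n, render_pixel(n))
```

Each sample alone is either black or some shade of gray. Only the average over many samples settles towards a stable value, which is why a Cycles render starts out grainy and cleans up as samples accumulate.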
Light is Everything
- DON’T use Blender’s Lamps unless you want everything to look like plastic. There’s a huge difference between lamps and objects (e.g. planes) with an emission shader (both in results and render time). Use the latter for a realistic look.
- In particular, a skin texture will never look right with lamp light. See below.
- Creating an invisible light source: Create any object, set its shader to Emission, and go to the “Object” properties (the icon is a yellow cube). At the bottom, there’s “Cycles Settings”. Disable the “Camera” checkbox in the Ray Visibility section.
- To avoid seeing these emission objects when editing (they get in the way all the time), put them in a different layer. Use Ctrl-click on the relevant layer to view it along with the current one when switching to render view.
Node Editor
- For a texture image: Add an Image Texture element, and open the file. Then do the UV mapping (nothing will be visible before that). If there are multiple texture files, they are all mapped with the same UV map by default (or at all?).
- Bump map: Image Texture > Bump (input Height) > Diffuse BSDF (input Normal) > Material Output (input Surface). Displaces the position along the normal, “Distance” says how much. With “Invert” unchecked, a high image value means outwards.
- Use an image’s transparency: Generate a Transparent BSDF shader, and connect it to a Mix Shader’s upper input. The lower input goes to the regular (Diffuse BSDF?) shader. The Image Texture’s Color goes as usual to the regular shader, but its Alpha output to the Mix shader’s Fac.
- Glossy BSDF: Mirror-like reflection when Roughness is set to zero, otherwise it’s diffusing the reflection.
- Velvet BSDF: Low angles between incident and reflection yield low reflection, so it emphasizes smooth contours. Good for combination with Diffuse shader for simulating human skin (compensate for too dark edges of the latter).
- Emission: Not just as a light source, but also a way to fake fill light.
- Color Ramp: Useful to turn an image into a one-dimensional range of colors, including Alpha, instead of manipulating the texture’s range.
- The Geometry input supplies Normal (which is after smoothing, pick True Normal for without) and Incoming (which is the direction of the light ray). Along with Converter > Vector Math set to Dot Product or Cross Product, the Value output with these two combined depends on the angle between the incident ray and the normal. Together with Color Ramp, this allows an arbitrary reflection pattern (use for Fac on some Mix shader).
- The Voronoi texture (using “Cells”) is great for simulating an uneven, grainy surface.
- To get a generally misty atmosphere, go to the World tab in Properties, and under Volume select Volume scatter with white color and Density of 0.1 to 0.2. Anisotropy should be 0.
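The Geometry + Dot Product + Color Ramp combination mentioned above boils down to simple arithmetic, sketched here in plain Python. This is not Blender's node code. The helper names are mine, the ramp is assumed to interpolate linearly between its stops, and both vectors are assumed to point away from the surface.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


def color_ramp(fac, stops):
    """Interpolate linearly between (position, value) stops.

    stops must be sorted by position, with distinct positions.
    """
    fac = min(max(fac, 0.0), 1.0)  # The ramp's input range is 0 to 1
    if fac <= stops[0][0]:
        return stops[0][1]
    for (p0, v0), (p1, v1) in zip(stops, stops[1:]):
        if fac <= p1:
            t = (fac - p0) / (p1 - p0)
            return v0 + t * (v1 - v0)
    return stops[-1][1]


def mix_factor(normal, incoming, stops):
    """Fac for a Mix shader: depends only on the angle between the vectors."""
    facing = dot(normalize(normal), normalize(incoming))
    return color_ramp(facing, stops)


# A ramp going from 1 (ray grazing the surface) to 0 (ray hitting head-on)
ramp = [(0.0, 1.0), (1.0, 0.0)]
print(mix_factor((0, 0, 1), (0, 0, 1), ramp))  # Head-on
print(mix_factor((0, 0, 1), (1, 0, 1), ramp))  # 45 degrees
print(mix_factor((0, 0, 1), (1, 0, 0), ramp))  # Grazing
```

Moving the stops around changes where on the object's contour the second shader kicks in, which is the "arbitrary reflection pattern" part.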
Achieving human skin appearance
Making a model look human and alive is the worst struggle of all. I’ve seen a lot of crazy attempts to add complicated shaders and stuff to reach a natural skin appearance. Even though I haven’t managed to get a face look natural (good luck with that), these are the insights I have reached.
- Rule zero: Use Cycles. Should be obvious.
- Rule number one: DO NOT USE LAMPS. All generation of light should be done with objects (most likely flat planes) with (white) emission shaders. Any inclusion of lamp objects makes everything look like plastic. Rendering convergence is indeed faster with lamps, but the result is disastrous, even when lamps are used for just fill light. In short, create real studio lighting.
- There’s no need for subsurface scattering and all those crazy shaders. These are a result of the impossible attempt to tweak the reflection to get something realistic in response to the plastic feel of lamp light. When the light is done properly, plain shaders are enough. Actually, Subsurface Scattering makes a marginal difference, and to the worse (deepens shadows, while actual skin somehow reflects in all directions).
- The Glossy part of flat skin (e.g. a leg) should be GGX (default) with roughness ~ 0.5. Diffuse with roughness 0.4 (doesn’t matter so much), mixed 50/50. Use the texture’s color for the Gloss shader as well (or mix partly with white).
- And here’s the really important part: Natural skin is full of small bruises and other uneven coloring that we barely notice when watching with the naked eye. It’s when this uneven coloring is gone (a woman wearing tons of makeup or a 3D model) that it looks like plastic. Therefore, the texture applied on the skin area (i.e. the coloring of the faces) should be aggressively uneven, with speckles and also wide areas of slight discoloring. Adding a leathery bump texture and/or wrinkles adds to the realistic look, but won’t get rid of the plastic feel unless the lighting is done right and the texture is alive.
- For the depth pattern of the skin, either use the Voronoi texture (see this page) on leather, or consider looking for images of elephant skin or something (the cell texture is similar). This is mainly relevant if closeups are made on the skin.
- Realistic eye: Be sure to add a cornea to the eye, mixing 90% transparent and 10% glossy shaders. The cornea’s ball should be 66% of the size of the eyeball, and brought to cover a little more than the iris. The reflection of the cornea brings the eye to life.
Animation
- Animation adds an Animation object to the controlled object’s hierarchy (with an ArmatureAction sub-object for Armatures).
- Key = Nailing down some properties of some object for a given frame.
- Don’t expect to change the pose and have all changes recorded.
- Rather, in the Timeline Editor, select the desired bones of the armature for keying (all bones of the armature, probably), pick which properties are being keyed (possibly just Rotation for plain motion) and click the key icon (“Insert Keyframe”).
- Keying Set = The set of objects whose properties are being keyed.
- Dope Sheet: Accurate, concise and gives control. Each channel is a property, each diamond is a key for that property. Thick lines between diamonds show that they haven’t changed along that time.
- Selection of keys: With right-click. Selecting the top diamond (“Dope Sheet Summary”) selects all keys of a frame (the Armature’s diamond selects all keys of an armature etc.)
- It’s possible to Copy-Paste keys with the clipboard icons at the bottom (or simply CTRL-C, CTRL-V). “Copy” relates to just the selected keys.
- In the Dope Sheet, use Shift-D and then G (grab) to copy all keys to another frame. Also possible to just Grab keys to adjust the timing etc.
- “Insert Keyframe” = store the properties of the current pose in the current positions. In Timeline Editor, this adds diamonds in the channels that correspond to the selected bones (or adds these channels). It doesn’t change or delete keys for bones not selected.
- Work flow: First, select the properties that are going to be involved (all bones of an armature?), and create a key for them in the Timeline Editor. The rest of the work is done in the Dope Sheet: Scrub to the desired frame, change the pose, and Key > Insert Keyframe > All Channels (or with I). Or possibly just selected channels, to leave the other channels interpolating as before.
- Note the difference between how Timeline and Dope Sheet store the pose: Timeline stores the properties of the selected bone only, while the Dope Sheet allows storing “All Channels”. Assuming that all relevant properties have channels in the Dope Sheet (it’s a good idea), “All Channels” captures the entire pose (and marks those that haven’t changed).
- Careful with jumping in time by accidentally clicking in the Timeline / Dope Sheet: It overrides all changes in the pose. To avoid this, “save your work” by “Inserting Keyframes” often.
- Don’t forget to move to a new frame before working on the next pose. If you do forget, copy the current frame’s keyframes into the clipboard, create a new keyframe with the current pose, and paste the previous keyframes into a slightly earlier frame. And then move (grab) the keys in time to their correct places.
- It’s possible (but usually pointless) to set the interpolation mode in the Dope Sheet (Key > Interpolation Mode). This controls the interpolation of the selected key until the next one. The default (set in User Preferences > Editing) is Bezier, which gives a natural feel.
- However the “Constant” interpolation can be useful for camera properties, when it’s desired to hold it still and then jump to other parameters (i.e. a “cut”).
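What "interpolation until the next key" means for a single channel can be sketched in plain Python. Note that this is a simplification of mine: it shows Constant against plain Linear interpolation, not the default Bezier (which eases in and out), and the key list format and function name are invented for the example.

```python
import bisect


def evaluate(keys, frame, mode="LINEAR"):
    """Value of one channel at a given frame.

    keys: list of (frame, value) pairs, sorted by frame (hypothetical format)
    mode: "LINEAR" or "CONSTANT", applied from each key until the next one
    """
    frames = [f for f, _ in keys]
    if frame <= frames[0]:
        return keys[0][1]
    if frame >= frames[-1]:
        return keys[-1][1]
    # Index of the last key at or before the requested frame
    i = bisect.bisect_right(frames, frame) - 1
    (f0, v0), (f1, v1) = keys[i], keys[i + 1]
    if mode == "CONSTANT":
        return v0  # Hold still, then jump at the next key (a "cut")
    t = (frame - f0) / (f1 - f0)
    return v0 + t * (v1 - v0)


# A camera's X position, keyed at frames 1, 11 and 21
camera_x = [(1, 0.0), (11, 10.0), (21, 10.0)]

for frame in (1, 6, 10, 11, 16):
    print(frame,
          evaluate(camera_x, frame, "LINEAR"),
          evaluate(camera_x, frame, "CONSTANT"))
```

With Constant, the camera stays put until frame 11 and then appears at its new position, instead of gliding there.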
Simulation
- Plain Physics fluid (simple example): It’s a 3D-grid based simulation running in a limited space, which is enclosed by the object to which the Physics > Fluid physics is attached with the “Domain” type (it’s the walls of the container as well as the limits of the simulated region). The Physics properties of this object are those determining the simulation (in particular the time scale in seconds via the End time, and the real-life size in meters). And the baking is done on this object. Other objects, which have the Physics > Fluid attached, will participate according to their Types, e.g. Fluid (the object will turn into a fluid) or Obstacle (which limits the motion of the fluid).
As I use Blender only occasionally, I’ve written down quite a few hints to myself for getting back to business. If this helps anyone else, so much the better. Except when said otherwise, the notes here relate to version 2.79b (which has a different interface from what is common today).
I’ve also written two similar posts on this matter: A post on 3D printing and a post on rendering and animation.
Random notes
- First of all: File > User Preferences > Input and set Orbit Style to Trackball (and not Turntable). The default (Turntable) keeps the Z axis always up, which is extremely limiting for proper modeling.
- For a realistic look, go for the Cycles render engine. Otherwise (in particular simplistic animations and modeling), stay with Blender’s original.
- Window > Duplicate Window for a new window with the same project, which can be organized completely differently (the content, modes, selections etc. remain in sync, but not the window layout). If you save the project, all windows will be opened the next time the project is opened.
- Always pay attention to possible hints at the bottom of the 3D view, in particular in interactive operations with the mouse.
- Enable / Disable cursor and grid in 3D view: In the Properties tool shelf, under Display toggle “Only Render” check box.
- Adding a mesh in Object Mode adds an object. Adding a mesh in Edit Mode adds the mesh to the current object.
- 3D graphics can be defined by meshes or NURBs. Both methods are available in Blender, but keep in mind that it ends up as a mesh when exported to STL for printing.
- For easier interface, pick File > User Preferences, select Interface tab and enable “Rotate Around Selection”. Otherwise the 3D viewport’s rotation is pretty annoying.
- Vertices = points. Edges = lines between points. Faces = 2D planes between edges + a normal vector defining its direction (see Wikipedia). If the vertices of a face aren’t coplanar, it’s drawn as separate triangles.
- There’s an “Object” button on the Properties subwindow, containing the properties of the Object: Position, rotation, locks, transparency, you name it.
- When quitting blender, the current design is saved as quit.blend. Use File > Recover last session to resume next time.
- Local coordinates are applied relative to the object’s parent coordinates, so there’s a tree of coordinate displacements.
- Pay attention to the view type, as stated at the 3D view’s top left. In particular, if it’s Local view, only the selected objects are seen. Note the plural. It’s possible to select several objects and see them all, but Edit Mode only applies to one of them.
- When drawing a mesh for a smooth surface, keep it uniform; don’t make dense extrusions to catch the details, but fix that at a later stage. See “Subdivision Surface” below.
- There’s a Manipulate only center of points icon at the bottom of the 3D view: Good for rotating or scaling several objects as a way to only move them, but if it’s on, manipulating a single object does nothing.
- An object may contain meshes with no connection between them.
- “Linked” = connected through vertices (what you’d naturally call a “thing”).
- It’s a good idea to remove double vertices every now and then: Edit mode, Vertex selection mode, select all vertices, Mesh > Vertices > Remove Doubles. This can be a result of an Extrusion canceled with Esc. Do this in particular if Subdivision Surface creates some ugly stuff for no apparent reason. Also try Mesh > Clean Up > Delete Loose.
- There are several constraints that can be applied. Some are self-related, and some to other objects: One object’s transforms are copied to another, one limiting the other. Not all are applicable for the simple (e.g. non-Game Engine) use.
- In order to view objects with transparency (with Cycles), viewport shading should be Material, and mix a diffuse shader with a Transparent Shader in the material (in the Node Editor). In the material tab of the object, set Viewport Alpha to “Alpha Blend”.
- In the Properties column, there’s a section for “Mesh Analysis” which paints different faces depending on various criteria. For example, find intersections, sharp regions etc.
Checklist when weird things happen
In Edit Mode, with all selected
- Mesh > Clean up > Delete Loose
- Mesh > Vertices > Remove Doubles
- Mesh > Normals > Recalculate Outside
If there are sharp spots or a point getting buried, reduce subdivision surface to zero, and then raise it gradually. Look for
- A face that shouldn’t be there (in particular an internal face)
- An edge to a far point
- Two very close vertices that appear to be one
- A double vertex, edge or face (these are the most difficult to spot). Possibly by selecting by region in Wireframe view, and verifying that the correct number of elements are selected.
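To get a feeling for what Remove Doubles does to those nearly-identical vertices, here is a naive sketch in plain Python (it needs Python 3.8 or later for math.dist). It's not Blender's algorithm. The function name, the threshold value and the brute-force comparison of every vertex against every kept one are my own choices for the sake of a short example.

```python
import math


def remove_doubles(vertices, threshold=0.0001):
    """Merge vertices that are closer than threshold to an earlier one.

    Returns the list of kept vertices, plus a remap list telling which
    kept vertex each original vertex ended up as (edges and faces would
    have to be rewritten with it).
    """
    kept = []
    remap = []
    for v in vertices:
        for i, k in enumerate(kept):
            if math.dist(v, k) <= threshold:
                remap.append(i)  # A double: reuse the earlier vertex
                break
        else:
            remap.append(len(kept))  # A genuinely new vertex
            kept.append(v)
    return kept, remap


verts = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 0.0, 0.00001),  # Looks like the first one on the screen
    (1.0, 1.0, 0.0),
]

kept, remap = remove_doubles(verts)
print(len(verts) - len(kept), "vertices removed")
print(remap)
```

The count of removed vertices is the same kind of hint the real tool gives: anything other than zero means there were doubles hiding in the mesh.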
Cheat sheet
- Spacebar = search functions. The ultimate cheat.
- Use N and T buttons to toggle visibility of the properties bar and Toolshelf, respectively.
- Fetch an object (or other resource) from another Blender File: Top menu > File > Append and find the relevant object by its name (it’s good to have them named properly…)
- Select and move around stuff: Right-click on object and move mouse. Left-click to fix in place. Also use G key (Grab).
- Use W to toggle between selection modes (box, circle, lasso or tweak).
- There’s also Border Select and Circle Select (under the Select menu). Pay attention to the “Limit Selection To Visible” button next to the Vertices/Edge/Face selection trio buttons.
- Left click moves the 3D cursor. Used as the landing point for added objects and as a pivot point if so chosen. In Solid 3D viewport shading, the depth element of the cursor’s position is where a ray of light would have hit an object as visible (or somewhere near, if there’s no object on the way). This is slightly inaccurate (a tolerance of 1/10000 measure units or so) so it may be better to use Snap > Cursor to active.
- Tilt-rotate view (just like a 2D image): Shift-Ctrl scroll mousebutton
- To rotate view around 3D cursor: In the Properties shelf (press N), View > Lock to Cursor
- Zoom in and out: Scroll button
- Move scenery: Ctrl-scroll or Shift-scroll. Or: Shift-hold middle button and move mouse.
- Rotate scenery: Press scroll button and move mouse around
- Local View is extremely useful when the scenery becomes full of objects (in particular light emission planes): Select the object to work with, and press numpad “/” (or View > View Global / Local) with the cursor on the relevant 3D view pane. This modification is relevant only to that pane, so other 3D views remain intact (important when some show render previews).
- “View Selected” (numpad “.”). Puts the selected item in the view’s center instead of trying to get that manually. Useful when selecting from object hierarchy.
- “Hide selected” (H) and unhide all (Alt-H). Get things out of view, in particular in Edit Mode (hide certain faces so one can see through). Works the same in Object Mode, but it’s equivalent to toggling the eye icon in the hierarchy.
- Delete stuff: X gives a menu
- Add objects: Add menu at the bottom left. Note that in Edit Mode, only meshes can be added, and the mesh is added to the selected object (not as a separate one!), as if it was a separate object Joined into one.
- Manually setting coordinates: Press N and look in the Transform submenu. “Local” coordinates means relative to the object’s own origin, and it’s quite useful.
- Setting an object’s parameters immediately after Adding it: Press T.
- Note the Object Mode vs Edit Mode at the bottom left: Selection of objects is possible only on Object mode.
- Modifiers: The wrench icon in the Properties pane (usually to the right). Can be stacked up, turned on/off momentarily, so don’t necessarily apply them right away.
- The common editing is done in 3D view (note the small selection boxes to the left, close to the bottom).
- View modes: At the bottom, next to “Object / Edit Mode”: Usually Solid, but Wireframe is informative, and Rendered can be nice (involves light)
- View menu: Useful for swapping Orthographic / Perspective view, and also to view from bottom, top, side etc. Most of these accessible from Numpad (see hints in menu).
- Make an object the center for rotation and scaling: View > View Selected
- Specials menu: W (for subdivide, which allows e.g. subdividing a face)
- Arbitrary vertices with edges between them: Enter Edit Mode for an object (possibly a dummy one, which is immediately deleted). Pick Vertex selection mode, and add vertices with CTRL+left mouse button. Useful along with an Empty Image (which can be semi-transparent) as a reference image.
- To make a 2D object -> 3D, possibly spin it (see Tool Shelf).
- Duplication of object: The Array Modifier (under “Generate”). For 2D/3D duplication, just cascade the modifiers.
- Mesh > Edges > Bridge Edge Loops is good for filling gaps. Use along with Edge Loops, or if it fails, select the entire body (with its opening) and pick Select Boundary Loop.
- But even better, use Dissolve rather than Delete for getting rid of Vertices / Edges / Faces, so the holes aren’t created in the first place.
- The Mirror modifier makes it easy to create symmetric objects. When used with Subdivision Surface, be sure to align all edges on the symmetry plane: Either by scaling to zero with the symmetry plane as the pivot point, or use the Boolean modifier against a large cube, or use the Shrinkwrap modifier against a cube with the outer vertices belonging to the effective vertex group.
Distorting objects
- Scale, Grab, and Rotate: In Edit or Object mode, select an object and press S, G and R respectively. Or add X, Y, or Z for a constrained rotation and scaling (e.g. SX or shift-X for only ZY). One keystroke for global axis, the second for local axis. Also use numbers (e.g. S0.5 and R90). Hold down shift for precision.
- Alternatively (sometimes better): Enable the transformation manipulators with the colored axis icon at the middle-bottom of the 3D view window. Then pick the type of manipulation (translate, rotate or scale) and the axis context (global, local or others). This allows for a simple manipulation across one axis (drag the manipulators).
- Change the Pivot Point (at the bottom of 3D view) for one-side scaling or rotating around something else than the center.
- To change the Origin of the object (for scaling or rotating), pick Object > Transform > Origin to… (e.g. 3D cursor). Only in Object Mode.
- Align vertices to a plane or line: Move the 3D cursor to the desired position, set the Pivot Point to the cursor, select the vertices to align. Then choose S with one of the axes, and press the “zero” button — scale to zero = no distance.
- Moving stuff: Change to Edit Mode (Bottom left menu). A few items to the right, there’s what to select: Vertex, Edge or Face select. Choose Edge or Face. Select an Edge or Face and move it around (with G or right-click). The displacement is two-dimensional, on the viewed plane.
- There’s also Mesh > Edge Slide which is good for moving around an Edge loop (to give a subdivided surface emphasis on the right place)
- There’s Select > Snap to Cursor and Snap to Cursor (Offset) which allows to move stuff to an exact position (the cursor can be moved to a selection prior to this).
- Extrusion ( = Duplicate vertices, add edges between previous and duplicated vertices, and move the selection): Select a Face and press E, then move the new face on the perpendicular axis. Note that if it’s canceled with an ESC, the four new vertices remain, glued to the original face. Use CTRL-Z to get rid of them.
- Extrusion with snapping: Press CTRL while moving mouse.
- Duplicate a face, connected: Extrude, move around, and press ESC. This allows, for example, scaling the new face and possibly moving it, or extruding it again.
- If more than one face is selected, all are extruded together.
- If Sculpt mode is going to be used, consider the Multiresolution Modifier, which is the same as Subdivision Surface, but allows allocating a different figure for sculpting.
- The Bevel modifier rounds off corners (slightly). The number of elements is crucial.
- The “Adjust Edit Cage to Modifier result” (rightmost button) allows deforming the modified (i.e. smoothed) wireframe. This is actual sculpture. This requires a proper division of faces to begin with.
- Alternatively, and probably because of a poorly constructed mesh with too many vertices, use Proportional Editing Mode, which forces changes to vertices within a region (button next to “select faces” button at the bottom of a 3D view, in Edit Mode). Use G to grab a vertex or whatever, and mouse wheel to enlarge (roll downwards, counterintuitively) or diminish (roll upwards) the region of influence. Size of influence region is given as a number at the bottom. There are various patterns of how the neighbors are influenced.
- There’s also the Hook modifier, doing the same as proportional editing.
- To add loops of vertices: CTRL-R in Edit mode, and roll the scroll button to get several loops. Left click to confirm where the loop goes. Or subdivide edges (under Mesh > Edges).
- To cut an object in two: Add a loop with CTRL-R, and then Mesh > Vertices > Rip.
- Or just cut: Select a few vertices, and Mesh > Vertices > Rip. It duplicates the vertices, but doesn’t connect them with edges.
- Giving thickness to a mesh: Mesh > Faces > Solidify (there’s also a Solidify modifier). For a convex mesh (it most likely is), set the offset to 1, so that the extra layer goes outwards (in the direction of the normals). Otherwise there will be overlapping faces around sharp corners. Take a close look around sharp corners, and check the normals before applying it.
- To make sharp ends, extrude a face, and Mesh > Vertices > Merge (in Edit Mode, of course).
- Free bending objects: Create a Bezier path (Add > Curve > Path) and add the Curve modifier (under Deform). The object is bent along the curve.
- Closing small gaps between different objects (e.g. shoe to lower leg, or a drop of water slipping down on a surface): Apply the Shrinkwrap modifier on the outer object (Nearest Surface Point is probably best), but only to a vertex group belonging to the interface region. Weight painting is useful here. Note that there’s no real meaning to “shrinking”: this is just vertices being glued to each other.
The Subdivision Surface Modifier and Catmull-Clark
- If a smooth surface is desired, this is very likely to be used.
- Watch this video. Really. Also Pixar’s page on this subject.
- The Subdivision Surface modifier smooths the object by cutting each edge into two for each subdivision round, hence multiplying the number of faces by four. The mesh turns into a quads-only mesh after the first round. If there are non-quads in the original mesh, artifacts may occur on this first round, when non-quads (in particular n-gons with an odd n) are split into quads.
- Use quads whenever possible, and avoid vertices with more than 5 edges. Important exception: When the mesh consists of sparse “anchor points” and not a detailed outline of the desired result.
- The positions of the added as well as moved vertices are weighted averages of a set of neighboring vertices. This causes an eroding effect on corners, turning a cube into a sphere.
- Works best on uniformly spaced quadrilateral-faces meshes. Don’t triangulate, be careful with double vertices (e.g. from a poorly aborted extrusion) and avoid changing density of the mesh (e.g. for capturing some detail).
- Keep the mesh minimal. It’s always possible to add vertices later. Each vertex of the mesh is a handle for deforming the curvature. Have too many of them, and it will be difficult to get a naturally smooth shape. Try to put the vertices where smooth peaks and valleys are expected, even subtle.
- Keep the mesh minimal II: If there are small dents, rather make them as a texture with the Displace modifier, based upon a texture image (which will naturally share UV mapping with the material’s texture). This modifier should be inserted after the Subdivision Surface modifier, so it moves the final, rounded mesh. It may require a large number of subdivisions to get a smooth displacement, but that’s only necessary on the final rendering. This way, the dents are kept where they’re required, not where they were an accident.
- Keep the mesh minimal III: If natural motion based upon bones is desired, it’s crucial to draw a (possibly curved) line of vertices that move along with the bone, and another line of vertices that stay in place. The faces between these two lines will do the stretching, and they must be laid out in a natural way, i.e. have the geometry of the skin surface that does the stretching in real life.
- Sharp shapes will generally shrink. They shrink less where the mesh is denser, because the averaging is done on the neighboring vertices, even if they’re close.
- It’s possible to manipulate the smoothed surface in Edit mode, with the changes taking effect on the original mesh underneath the modifier.
- The Crease property of certain edges increases their weight in the average for calculating the vertices’ positions, ranging from 0 to 1. This makes the edge sharper after subdivision. When applied to a face, the resultant form gets close to the face (a cube turns into a cylinder if opposite faces have crease set to 1.0). Select the relevant face (in Edit mode), and press Shift-E. Or set manually in the Transform bar (Visible with N). Creasing makes the edges redder.
- The best way to copy a smooth shape from a 2D image is to start with a subsurfaced low-count mesh, and match the smoothed shape with the 2D image’s. Then possibly apply one round of subdivision, and fine-tune. Use the crease property for sharp turns rather than adding edges if possible.
- Unwanted creases can be a result of double edges. Select entire mesh, and go Mesh > Vertices > Remove Doubles and Mesh > Clean Up > Delete Loose.
- Dents can be a result of an uneven mesh. Consider using Edge Loops and Dissolve Edges to get it lighter.
- To create a sharp corner (e.g. the corner of eyes) or a pointy surface, extrude a vertex (creating an edge) and pull the vertex away from the surface. The edge, which is connected to nothing, pulls the surface towards the vertex. This edge is invisible during rendering, and isn’t cleaned up by “Delete loose” etc.
- For the fine details, consider the Multiresolution modifier, which allows sculpting on top of a cruder mesh (the former for the coarse form). Or something similar?
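The growth in face count and the eroding effect on corners mentioned in this section can be put into numbers with a minimal Python sketch. It assumes the textbook Catmull-Clark rules on a closed mesh with no creases; the function names are made up for this illustration, and nothing here is taken from Blender’s own implementation.

```python
def faces_after_subdivision(face_sides, levels):
    """Face count after `levels` rounds of Catmull-Clark subdivision.

    face_sides: list with the number of edges of each original face.
    Round 1 splits every n-sided face into n quads; each later round
    splits every quad into 4.
    """
    if levels <= 0:
        return len(face_sides)
    return sum(face_sides) * 4 ** (levels - 1)


def cube_corner_after_one_round(half_size=1.0):
    """Where the corner (h, h, h) of an axis-aligned cube lands after one round.

    Catmull-Clark moves an original vertex P with n adjacent faces to
    (F + 2R + (n - 3)P) / n, where F is the average of the adjacent face
    centers and R is the average of the adjacent edge midpoints.
    """
    h = half_size
    n = 3  # three faces (and three edges) meet at a cube corner
    p = (h, h, h)
    face_centers = [(h, 0, 0), (0, h, 0), (0, 0, h)]
    edge_midpoints = [(h, h, 0), (h, 0, h), (0, h, h)]
    f = [sum(c[k] for c in face_centers) / n for k in range(3)]
    r = [sum(m[k] for m in edge_midpoints) / n for k in range(3)]
    return tuple((f[k] + 2 * r[k] + (n - 3) * p[k]) / n for k in range(3))


print(faces_after_subdivision([4] * 6, 3))  # a cube after three rounds
print(cube_corner_after_one_round())        # the corner is pulled inwards
```

With these rules, a cube corner at distance 1 along each axis ends up at 5/9 along each axis after a single round, which is the shrinking of sharp shapes described above.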
Joining / subtracting objects / making holes
- Select two objects, and then CTRL-J. Can be separated again: Press P (Separate) in Edit mode and choose By loose parts
- Fusing / subtracting objects (best done in Wireframe view): Select one object, pick Properties > Wrench > Add Modifier > Boolean. Within there, select Union, Difference or Intersect. It’s always with respect to another object. The “picker” object in the little Object window allows selecting the other object in the view. Once it’s fine, pick Apply.
- The Boolean modifier messes up the mesh with duplicate vertices, and even worse, duplicate edges: Adjacent faces along the intersection may not share an edge, but instead have one of its own each. This is fairly acceptable with 3D printing, but creates warnings. Boolean is best used with simple objects for cutting. For example, a large cube to cut off parts of the modified object.
- Fusing, the right way I: Remove the intersecting faces manually, select the hole’s edges at both sides by selecting one edge for each and use Select > Edge Loops (in Edit Mode only). And then Mesh > Edges > Bridge Edge Loops.
- Fusing, the right way II: If there are a lot of intersection points between the objects, use the Boolean modifier with the Union option to fuse two objects together. The objects should be manifold (it may work even if not, though). It is then required to check all intersection points manually in order to ensure that they were handled correctly. This usually requires using the “hide face” feature to see what happens inside. In particular, look for faces at the crossing points that should have been removed (because both sides of them are “inside”), but are still left there. Just delete these faces.
- The Knife tool (in the toolshelf) allows drawing straight lines, which creates new edges at the cutting points, and subdivides faces. The start and ending points are somewhere along edges, and not on vertices.
Not clear why, but even though vertices are highlighted, there’s no cut there. To get the cut from a vertex, first cut an adjacent edge nearby, and then merge the new vertex into the old one (Alt-M). (I’ve struck out this comment, because sometimes it’s true, sometimes it isn’t.)
- Activate Cut-Through with Z. This makes a hole in both ends.
- There’s also Knife projection, which allows cutting a hole with a curve. Works with simple patterns: Select the two objects in Object Mode (the object to cut second), enter Edit Mode, set the viewing angle and click Knife Project. I didn’t manage to enable the cut-through option for this, as I didn’t find the relevant checkbox on Blender 2.79.
- Both cut-through and projection depend on the 3D viewing angle.
- To punch a hole through a 3D body, set the shape with a 2D mesh or curve, and cut the shape on the faces on both sides. Then select the 2D holes on both sides, and pick Mesh > Edges > Bridge Edge Loops to draw edges across the body. The faces make a nice 3D hole.
- To create an internal hole, generate the shape of the hole as a separate object, and then (in Edit Mode) Mesh > Normals > Flip Normals. Join this object with the target, and place it as desired.
Notable modifiers
- Generate / Boolean: Create an object that is the intersection, union or difference between two objects. Excellent for chopping off a corner or even larger cuts by applying it with a large cube. Doesn’t work all that well when two complex meshes are involved, and the resultant mesh is often quite messy, and may need some work, in particular if it’s due for further manipulations. For 3D printing, this mess is often good enough, except for simple fixes.
- Generate / Array: The way to duplicate an object.
- Generate / Mask: Make all vertices belonging to a vertex group (or not belonging to a vertex group) invisible, both for render and editing. Extremely useful for working with complex structures, allowing to focus on certain (possibly internal) parts of a mesh (instead of hiding them every time).
- Generate / Skin: Creates a body around edges (which function as an armature). A quick way to create an arbitrary 3D shape.
- Generate / Wireframe: Gives thickness to the existing wireframe, making it a body. A bit like Skin, but simpler.
- Generate / Triangulate: Make all faces triangles
- Deform / Displace: Displace the vertices’ position as a function of a texture (i.e. an image). Useful for “printing text” on an object or making a bumpy surface with a certain pattern.
- Deform / Laplacian Deform: Allows deforming an object while preserving geometric properties.
- Deform / Shrinkwrap: As its name implies: pushes the vertices towards the exterior of another object, after applying its modifiers. Neat to get rid of overlapping meshes, but careful with the corners. The “Nearest Surface” (default) mode should be chosen for simple use. “Nearest Vertex” in conjunction with Vertex groups is just a way to glue vertices from different objects together, but it’s not necessarily useful. Try both.
- Deform / Simple Deform: Allows freehand twisting and bending and other deformations. Another (empty) object’s location, size and orientation gets different effects.
Sculpt mode
- Excellent for fixing small dents with the “Smooth” tool (those dents that shouldn’t happen in the first place, because of a properly designed mesh…).
- Also good for small finishes, but requires a dense mesh to work with. Use when editing with Subdivision Surface isn’t good enough.
- Note the Brush > Sculpt Tool. “Draw” pulls up the mesh a bit (as if adding material) but if “Subtract” is chosen on the toolshelf, it dents inwards.
- Another Sculpt Tool is “Scrape”, which is good for selectively rounding off corners (like Bevel, just not globally).
Other modes
- Vertex Paint: Simple, intuitive painting of the faces. For this to appear on render, a material must be assigned, and the “Vertex Color Paint” option must be checked (this isn’t the default). As expected, the paint follows the faces if they’re moved.
- Weight Paint: An intuitive way to mark the weight of a vertex group. This has to do with Vertex Groups, an important concept for moving parts of a body along with a “bone”, as well as the Shrinkwrap modifier and other stuff. The painting applies to vertices, therefore use with subdivision and other modifiers off, and hit the right points. Subdivision surface interpolates the weights across edges between points.
- Texture Paint: A more difficult way to paint a 3D model, but it generates a texture image one can save back. Useful for marking what goes where on the texture image, saving it back and then editing the image, instead of fiddling with the UV mapping. Also for smoothing seams (with Clone brush). Keep the surface subdivision rate low for this. Too many faces, and the painting goes from slow to impossible.
Measurements
- Measurements are invisible by default and when loading a .blend file. Press the “Show” button on the display tab to make them appear.
- Enable MeasureIt add-on (go to User Preferences > Add-ons).
- To add a new measurement, use the MeasureIt tool at the Toolbar > Display tab (press “t” if no toolbar is visible). Typically, the distance between two vertices is interesting. It updates as the object grows and shrinks etc. Give the measurement a name.
- For editing (and possibly removing) existing measurements, go to the Properties bar’s bottom (“n” for making it visible): There’s a “MeasureIt” subgroup — expand it.
- The units are meters by default. Edit the first measurement to be in millimeters, and the rest will follow.
- To measure the distance between vertices of different objects, select one vertex on each in Edit mode, and then pick “Link” in object mode with both objects selected.
- To measure the distance between the origins of two objects, enter Edit mode for both, make sure no vertex is selected in either, and then pick “Link” in object mode, with both objects selected.
Turning an SVG into a mesh
General note: Inkscape is a great tool for manipulating SVGs.
It might be a good idea to do this on a clean project, and import the result to another. A lot of objects are generated, so it will be easier to do operations with “select all”.
- Import the SVG file into blender (File > Import)
- A lot of curves appear (or possibly one). In Object Mode, select them all (possibly with Border Select) and go Object > Convert to > Mesh from (whatever). Keep an eye on the object hierarchy: The sub-elements turn from curves to meshes, even though the objects themselves still carry names that imply curves. Neither does anything special happen on the 3D view. It’s easy to be misled into thinking nothing happened.
- Then Object > Join to get a single object of all segments (if there were many curves to begin with). It’s a good idea to rename the object at this point.
- Select the said object, enter Edit mode and go Mesh > Vertices > Remove doubles. This connects edges, that were previously individual curves, into continuous shapes. Actually, this merges double vertices rather than removing them.
- In Edit Mode, select everything and scale as necessary. The reason for Edit Mode is that scaling in Object Mode requires applying the transformation; otherwise the edge lengths displayed as info in Edit Mode are wrong.
- For an SVG file exported by gerbv, I needed to enlarge by 1000. Possibly because the units in Blender were set to millimeters, and gerbv exported in meters…?
- Generating text: In GIMP, right click on the text layer (in Layer view) and select “Text to Path”. Then go for the Path tab, right-click the path with the text and click “Export…”. Select a file name, set the extension to .svg. This creates a single curve, so it’s OK to import it directly into the main project. Scale it (grow) by 100 or so immediately after importing for some sensible size. Pay attention to inner holes in letters (e.g. in O and B), so they aren’t filled by mistake. Use Knife Project tool to cut into an existing face.
Converting a mesh into SVG (and then to pdf)
Be sure to activate the “Curve: Curve tools addon” (Freestyle SVG exporter is not necessary for exporting a simple curve, by the way).
To export a curve into SVG:
- Object > Convert > Curve from Mesh / Text
- File > Export > Curves (.svg)
To export it to pdf, use inkscape as a GUI tool. It’s also possible to use inkscape as a command-line utility.
When a pdf file is obtained, convert it to PS with pdf2ps, and verify that the bounding box is correct (i.e. the dimensions are OK). Check the HiResBoundingBox. The values are given in points, 1 pt ≈ 0.352778 mm.
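Checking the bounding box boils down to a unit conversion, which can be scripted. The following is a minimal Python sketch: it looks for the %%HiResBoundingBox comment in the PS file’s text and converts the box’s size to millimeters. The function name and the sample header are made up for this illustration.

```python
import re

PT_TO_MM = 25.4 / 72  # one PostScript point is 1/72 inch


def hires_bbox_mm(ps_text):
    """Return (width_mm, height_mm) from the first %%HiResBoundingBox line."""
    match = re.search(
        r"^%%HiResBoundingBox:\s*(\S+)\s+(\S+)\s+(\S+)\s+(\S+)",
        ps_text, re.MULTILINE)
    if match is None:
        raise ValueError("no %%HiResBoundingBox line found")
    llx, lly, urx, ury = (float(v) for v in match.groups())
    return (urx - llx) * PT_TO_MM, (ury - lly) * PT_TO_MM


# Hypothetical header of a PS file, for demonstration only
sample = ("%!PS-Adobe-3.0\n"
          "%%HiResBoundingBox: 0.000000 0.000000 283.464567 141.732283\n")
width, height = hires_bbox_mm(sample)
print("%.2f mm x %.2f mm" % (width, height))
```

For a real file, read its content with open(...).read() and pass it to hires_bbox_mm().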
Notes on Blender 2.90.1
As already mentioned, the notes above relate to Blender 2.79b. That’s the version just before a significant reorganization of the user interface’s look and feel. All changes that I’ve encountered so far made sense to me, so I have no complaints. But I wrote down a few points about the differences:
- First, the most notable difference. Selection with left-click, not right click.
- Workspaces: Independent setups of areas. Use Window > New Window to get a workspace in a separate window (old school style) or duplicate the main window with Window > New Main Window.
- Menus appear as a hamburger icon at the top left, unless Header > Show menus is picked.
- Organizing the workspace: Right-click on edge of an area and choose to split or join (this was previously done by dragging).
- If you mess up the working environment (in particular the header’s layout), just create a new Modeling workspace, and delete the current one (or another type of workspace, if so necessary).
- There are hints on the mouse’s function at the window’s bottom.
- The mode (Object mode, Edit mode etc.) is at the upper left. Toggle between Object mode and Edit mode with Tab.
- F3 instead of spacebar for command search.
- Press “a” twice rapidly to deselect all (it previously toggled select / deselect). Or just click somewhere random.
- “n” and “t” can still be used to toggle the visibility of the Properties / Tool / View region and the Toolbox, however these can also be made visible by clicking the small arrows at the upper left and right. For the properties, it’s possible to allocate a separate area for it.
- “Remove doubles” is now Mesh > Cleanup > Merge by distance.
- Switching between wireframe, solid, material preview and rendered: With z + a keystroke or click in a certain region. It used to be “z” to switch just between wireframe or not.
- And there’s X-ray view! Toggle with Alt-z (or button at top right).
- Layers are replaced with “collections”. Right-click a collection to determine its visibility in Viewports and/or Render.
- Saving: Just press CTRL+s, like any normal program. No confirmation window.
- To change the measurement units: The icons are now lined up vertically, so the Scene Properties are at the fifth icon from the top on the outliner area (usually to the right). Expand the Units group.
- To enable display of edge lengths, normal etc: Go to Overlays drop-down, which opens in response to one of the triangle buttons to the top right.
- Vertex / Edge / Face select: Buttons at top left, next to “Edit mode”, or press 1/2/3 buttons, but not those on the numeric keypad.
- Setting the scene’s units to millimeters: Under Scene properties (fifth icon in the outline area), expand Units, and set Unit Scale to 0.001 on the Metric system. Then pick the Overlay drop-down menu (to the top right in the 3D area), and set Scale to 0.001 as well, so the grid is tight enough.
- Measurement: There is a really good measurement tool at the left tool shelf (toggle visibility with “t”). To delete all measurements, go to the properties panel (toggle visibility with “n”), select the View tab, expand the Annotations group, and delete RulerData3D by clicking “-”. The measurements won’t disappear right away, but they won’t survive moving away from the measurement tool (and back).
- Blender Preferences > Navigation, uncheck Auto Perspective. This seems like a new “feature” that the view mode automatically switches to perspective all the time. Super annoying.
By the way, everything that is drawn except the modeled objects (axes, vertex points etc.) is called “overlays”.
Sources
Introduction
This post shows how to access some GPIO functionalities from Xillinux running on a Z-Turn Lite board (with a Z-turn Lite IO Cape board attached), directly from the command line.
Watchdog
When the “WD” jumper at J26 on the board is placed, it’s possible to utilize the board’s watchdog chip, which resets the processor if its watchdog-clear pin isn’t toggled for 1.2 seconds. If that pin is in high-Z, the watchdog is inactive, and doesn’t reset the processor even if no toggling has taken place. This can be achieved either by removing the “WD” jumper, which floats the pin, or making the pin high-Z by setting the relevant GPIO to an input (Xillinux ensures the latter, so booting it with the “WD” jumper is safe).
When the pin is high-Z, a small sawtooth-like pulse, which is a few microseconds wide, is visible with an oscilloscope every 1.2 sec, and it’s the watchdog driving the pin to verify that the wire is in high-Z.
The watchdog’s clear pin is wired to the Zynq’s PS-only pin MIO0, which is configured as GPIO 0.
To take control of this pin from the command line:
# echo 0 > /sys/class/gpio/export
# echo out > /sys/class/gpio/gpio0/direction
These commands turn the GPIO into an output, and hence it’s not a high-Z anymore. The pin must start toggling every 1.2 seconds from this moment, or the processor is reset.
To prevent this reset, the following command can be used:
# while [ 1 ] ; do echo 1 > /sys/class/gpio/gpio0/value ; echo 0 > /sys/class/gpio/gpio0/value ; sleep 0.5 ; done
This works on MYiR’s OOB Linux as well (not just Xillinux).
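The same toggling can be done from a script. This is a minimal Python sketch under the assumption that GPIO 0 has already been exported and set to “out” as shown above; the function name and its parameters are made up for this illustration. Unlike the shell loop, it stops after a given number of pulses, so the watchdog resets the processor about 1.2 seconds after it returns.

```python
import time


def kick_watchdog(value_path, cycles, pause=0.5):
    """Pulse the GPIO `cycles` times: write 1, write 0, then sleep `pause` s."""
    if pause >= 1.2:
        raise ValueError("pause must stay below the 1.2 s watchdog timeout")
    for _ in range(cycles):
        for level in ("1", "0"):
            # Each write is a separate open / close, like the shell's echo
            with open(value_path, "w") as f:
                f.write(level)
        time.sleep(pause)


# On the board, after exporting GPIO 0 and setting its direction to "out":
# kick_watchdog("/sys/class/gpio/gpio0/value", cycles=10)
```

The check on the pause is there only to catch the obvious mistake of sleeping longer than the watchdog tolerates.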
Sensing the IO Cape board’s pushbutton
Not to be confused with the button on the Z-Turn Lite board itself, this is how to fetch the value of the button on the IO Cape Board:
# echo 88 > /sys/class/gpio/export
# cat /sys/class/gpio/gpio88/direction
in
# cat /sys/class/gpio/gpio88/value
This prints out 0 or 1, depending on the button’s state.
Controlling the IO Cape board’s J8 pins
Out of the box, Xillinux routes 34 GPIO I/Os to the IO Cape board’s J8 connector. This can be modified easily by editing the top-level module of Xillinux’ logic design, but this is beyond this post’s scope.
The 34 pins are wired to the connector’s pins 3 to 36. In Linux, to access pin N on the J8, request GPIO number N+51.
For example, in order to toggle pin J8/3, the GPIO to request is 3 + 51 = 54, so the following commands at shell prompt cause some fast toggling:
# echo 54 > /sys/class/gpio/export
# echo out > /sys/class/gpio/gpio54/direction
# while [ 1 ] ; do echo 1 > /sys/class/gpio/gpio54/value ; echo 0 > /sys/class/gpio/gpio54/value ; done
The GPIO pins can also be used as inputs, by following the standard Linux API for GPIO. Note however that pins J8/31 and J8/34 are pulled up with resistors on the IO Cape board.
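The mapping from J8 pins to GPIO numbers is simple arithmetic, but easy to get wrong by one. A minimal Python sketch of it follows, using only the figures given above (pins 3 to 36, offset 51); the names are made up for this illustration.

```python
J8_FIRST_PIN = 3   # first J8 pin wired to a GPIO
J8_LAST_PIN = 36   # last one (34 pins in total)
GPIO_OFFSET = 51   # GPIO number = J8 pin number + 51


def j8_pin_to_gpio(pin):
    """Return the Linux GPIO number to request for a pin on J8."""
    if not J8_FIRST_PIN <= pin <= J8_LAST_PIN:
        raise ValueError("J8 pin %d is not routed to a GPIO" % pin)
    return pin + GPIO_OFFSET


def gpio_value_path(gpio):
    """Path of the sysfs file for reading or writing the GPIO's value."""
    return "/sys/class/gpio/gpio%d/value" % gpio


for pin in (3, 31, 36):
    gpio = j8_pin_to_gpio(pin)
    print("J8/%d -> GPIO %d -> %s" % (pin, gpio, gpio_value_path(gpio)))
```

Note that the last J8 pin lands on GPIO 87, right below GPIO 88, which is the IO Cape board’s pushbutton.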
Trying to run Vivado 2017.3 with GUI and all on a remote host with X forwarding, i.e.
$ ssh -X mycomputer
setting the environment with
$ . /path/to/Vivado/2017.3/settings64.sh
it failed with
$ vivado &
terminate called after throwing an instance of 'std::runtime_error'
what(): locale::facet::_S_create_c_locale name not valid
Now here’s the odd thing: The error message is actually helpful! It is a locale problem:
$ locale
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
LANG=en_IL
LC_CTYPE="en_IL"
LC_NUMERIC="en_IL"
LC_TIME="en_IL"
LC_COLLATE="en_IL"
LC_MONETARY="en_IL"
LC_MESSAGES="en_IL"
LC_PAPER="en_IL"
LC_NAME="en_IL"
LC_ADDRESS="en_IL"
LC_TELEPHONE="en_IL"
LC_MEASUREMENT="en_IL"
LC_IDENTIFICATION="en_IL"
LC_ALL=
Checking on a non-ssh terminal, all read “en_US.UTF-8” instead. The problem seems to be that the SSH is from a newer Linux distro to an older one. “en_IL” is indeed the locale on the newer machine, which is OK there. And SSH changed the locale (which I believe one can avoid, but it’s not worth the effort given the simple workaround below).
So the fix is surprisingly simple:
$ export LC_ALL=en_US.UTF-8
and then check again:
$ locale
LANG=en_IL
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
OK, so LANG is still rubbish, but after this Vivado 2017.3’s GUI is up and running.
I should mention that there was no problem starting Vivado 2014.4’s GUI, but instead it crashed somewhere in the middle of the implementation. Once again, fixing the locale solved this.
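The reason a rubbish LANG does no harm is the order in which the locale variables are consulted: LC_ALL overrides everything, then comes the category’s own variable, and LANG is only the last resort. A minimal Python sketch of this precedence, as POSIX defines it, follows. The function name is made up, and the “C” fallback for the case where nothing is set is a simplification.

```python
def effective_locale(env, category):
    """Locale name that applies to `category` (e.g. "LC_CTYPE").

    Precedence follows POSIX: LC_ALL, then the category's own variable,
    then LANG. Empty values count as unset.
    """
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"  # simplified fallback when nothing is set


over_ssh = {"LANG": "en_IL", "LC_ALL": ""}     # the failing session
fixed = dict(over_ssh, LC_ALL="en_US.UTF-8")   # after the export

print(effective_locale(over_ssh, "LC_CTYPE"))
print(effective_locale(fixed, "LC_CTYPE"))
```

In the failing session, every category falls through to LANG, which names a locale that the older machine doesn’t have. Once LC_ALL is set, LANG is never reached.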
Introduction
Xilinx’ documentation says that XC7Z007S, among other “S” devices, is a single-core device, as opposed to, for example, its older brother XC7Z010, which is dual-core. So I compared several aspects of the PS part of a Z007S vs. Z010, and to my astonishment, I found that Z007S is exactly the same: Two CPUs are reported by the hardware itself, SMP is kicked off on both, and a simple performance test I made showed that Z007S runs two processes in parallel as fast as Z010.
So the question is: In what sense is XC7Z007S single-core? For now, I have no answer to that. I’ll update this post as soon as someone manages to explain this to me. In the meanwhile, I’ve tried to get this figured out in Xilinx’ forum.
The rest of this post outlines the various similarities between the Z007S vs. Z010 I tested. The PL bitfiles of different Zynq devices are incompatible, so there’s no chance I mistook which devices I worked with.
The tests below were made with Xillinux-2.0 (kernel v4.4) on two Z-turn Lite boards, one carrying Z007S, and one Z010.
Found 2 CPUs?
I started wondering when the kernel’s dmesg log indicated that it had found 2 CPUs on a Z007S:
[ 0.132523] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[ 0.132586] Setting up static identity map for 0x82c0 - 0x82f4
[ 0.310962] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[ 0.311065] Brought up 2 CPUs
[ 0.311102] SMP: Total of 2 processors activated (2664.03 BogoMIPS).
[ 0.311121] CPU: All CPU(s) started in SVC mode.
Also, /proc/cpuinfo consistently listed two CPUs. One could think that it’s because two CPUs are declared in the device tree, but removing one of them makes no difference.
On Z010, the exact same log appears in this matter, and /proc/cpuinfo says the same.
CPU’s hardware register reporting two CPUs
According to the Zynq-7000 AP SoC Technical Reference Manual (UG585), the processor’s SCU_CONFIGURATION_REGISTER indicates the number of CPUs present in the Cortex-A9 MPCore processor in bits 1:0. Binary ‘01’ means two Cortex-A9 processors, CPU0 and CPU1. Binary ‘00’ means one Cortex-A9 processor, CPU0.
Using Xillinux-2.0’s poke kernel utility to read the processor’s SCU_CONFIGURATION_REGISTER register, I got exactly the same result on Z007S and Z010:
poke read addr=f8f00004: value=00000511
In other words, both devices report two processors.
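For reference, this is how the value read from the register translates into a CPU count. It’s a minimal Python sketch based only on the two encodings quoted above from UG585 (‘00’ for one CPU, ‘01’ for two); treating the field as “number of CPUs minus one” for other values is an assumption, and the function name is made up.

```python
def scu_cpu_count(scu_config):
    """Number of CPUs encoded in bits 1:0 of SCU_CONFIGURATION_REGISTER."""
    return (scu_config & 0x3) + 1


value = 0x00000511  # as read from address 0xf8f00004 on both Z007S and Z010
print("bits 1:0 = {:02b}, CPUs = {}".format(value & 0x3, scu_cpu_count(value)))
```

The upper bits of 0x511 carry other information, and are masked away here.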
I’m under the impression that the kernel uses this register to tell the number of CPUs by virtue of the scu_get_core_count() (defined in arch/arm/kernel/smp_scu.c) function, called by zynq_smp_init_cpus() in arch/arm/mach-zynq/platsmp.c.
The latter function sets the kernel’s “CPU possible” bits, so it’s how the Zynq-specific kernel setup code tells the kernel framework which CPU indexes are good for use.
Also, the U-Boot code used by Xillinux for Z-Turn Lite prints out the processor count, based upon SCU_CONFIGURATION_REGISTER, as well as other info. For Z007S it gave:
U-Boot 2013.07 (Sep 17 2018 - 11:51:45)
Detected device ID code 0x3 (XC7Z007S) with 2 CPU(s), PS_VERSION = 3
Strapped boot mode: 5 (SD Card)
and for Z010:
U-Boot 2013.07 (Sep 17 2018 - 11:51:45)
Detected device ID code 0x2 (XC7Z010) with 2 CPU(s), PS_VERSION = 3
Strapped boot mode: 5 (SD Card)
A simple benchmark test
The proof is in the pudding. I wrote a simple program, which forks into two processes, each running a certain amount of plain CPU-intensive operations, and then quits. The output of this program is of no interest, but it’s printed out to prevent the compiler from optimizing away the crunching. Its listing is given at the end of this post for reference.
Using the “time” utility to measure the execution times, I ran the program on Z007S and Z010, and consistently got the same results, except for slight fluctuations:
# time ./work 400
Parent process done with LSR at e89c4641
Child process quitting with LSR at e89c4641
Parent process quitting
real 0m3.604s
user 0m7.030s
sys 0m0.010s
The 3.6 seconds given as “real” is the wall clock time. The 7 seconds of “user” time is the amount of consumed CPU. And as one would expect from a program that runs on two processes on a dual core machine, the consumed CPU time is approximately double the wall clock time. This is the result I expected from Z010, but not from Z007S.
Just to be sure I wasn’t being silly, I booted the kernel with “nosmp” in the kernel command line, which forced a single-CPU bringup. Indeed, the kernel reported finding one CPU in its logs, and /proc/cpuinfo reflected that as well.
And the pudding?
# time ./work 400
Parent process done with LSR at e89c4641
Child process quitting with LSR at e89c4641
Parent process quitting
real 0m6.998s
user 0m6.970s
sys 0m0.010s
Exactly as expected: With one processor, forking into two processes has no advantage. The CPU time is the wall clock time. I waited twice as long for it to finish.
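The comparison between the two runs can be reduced to a single figure: the consumed CPU time (user + sys) divided by the wall clock time, which is roughly the number of CPUs that were kept busy. A minimal Python sketch of this calculation on the figures above follows; the function names are made up for this illustration.

```python
import re


def to_seconds(stamp):
    """Convert the '0m3.604s' format printed by time into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError("unexpected time format: %r" % stamp)
    return int(match.group(1)) * 60 + float(match.group(2))


def cpus_kept_busy(real, user, sys_time):
    """Ratio between the consumed CPU time and the wall clock time."""
    return (to_seconds(user) + to_seconds(sys_time)) / to_seconds(real)


print("SMP:   %.2f" % cpus_kept_busy("0m3.604s", "0m7.030s", "0m0.010s"))
print("nosmp: %.2f" % cpus_kept_busy("0m6.998s", "0m6.970s", "0m0.010s"))
```

A ratio close to 2 means that both processes ran in parallel all the time, and a ratio close to 1 means that they took turns on a single CPU.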
At some point I suspected that the specific Linux version I used had a specific scheduler issue, which allowed a single-core CPU to perform as well as a dual-core. However, the dual-core test was repeated on a Zybo board with three completely different kernels (other than Xillinux-2.0), and yielded the same results (or slightly worse, with older kernels).
Conclusion
Given the results above, it’s not clear why Z007S is labeled as a single-core device. It’s not a matter of how it quacks or walks, but in the end, the device performs twice as fast when the work is split into two processes.
Or I missed something here. Kindly comment below if you found my mistake.
———————————–
Appendix: The benchmark program’s listing
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <time.h>
#include <signal.h>
#include <errno.h>
#include <string.h>
#include <sys/wait.h>

static unsigned int lsr_state;

int main(int argc, char *argv[]) {
  int count, i, j, bit;
  pid_t pid;

  if (argc != 2) {
    fprintf(stderr, "Usage: %s count\n", argv[0]);
    exit(1);
  }

  count = atoi(argv[1]);
  lsr_state = 1;

  /* From here on, parent and child run the same code */
  pid = fork();

  if (pid < 0) {
    perror("Failed to fork");
    exit(1);
  }

  /* The crunching: count * 2^20 steps of a linear shift register */
  for (i=0; i<count; i++)
    for (j=0; j<(1<<20); j++) {
      bit = ((lsr_state >> 19) ^ (lsr_state >> 2)) & 0x01;
      lsr_state = (lsr_state << 1) | bit;

      if (lsr_state == 0) {
        fprintf(stderr, "Huh? The LSR state is zero!\n");
        exit(1);
      }
    }

  if (pid == 0) {
    fprintf(stderr, "Child process quitting with LSR at %x\n", lsr_state);
    return 0;
  }

  fprintf(stderr, "Parent process done with LSR at %x\n", lsr_state);

  /* The parent waits for the child before quitting */
  pid = wait(&i);

  fprintf(stderr, "Parent process quitting\n");
  return 0;
}
saved as work.c, compiled with
# gcc -O3 -Wall work.c -o work
directly on the Zynq board itself (Xillinux comes with a native gcc compiler). But cross compilation should make no difference.
Ever wanted to see how a Linux USB host talks with its PHY with ULPI commands? Probably not. But if you do, here’s how I did it on a Zynq device, connected to a USB3320 USB 2.0 PHY chip. Note that:
- The relevant sources must be compiled into the kernel. Modules are loaded too late. The choice of PHY frontend is made when the USB driver is initialized, and if the relevant driver isn’t handy, a generic PHY is picked instead…
- … which is most likely as good. In retrospect, there’s very little reason to load the actual driver.
- In particular, my system works great without the dedicated USB PHY driver.
So it’s about adding a plain pr_info() into the kernel’s drivers/usb/phy/phy-ulpi-viewport.c, so it prints every ULPI register write command to the kernel log. The added code is the pr_info() call at the top of the function:
static int ulpi_viewport_write(struct usb_phy *otg, u32 val, u32 reg)
{
        int ret;
        void __iomem *view = otg->io_priv;

        pr_info("ulpi_viewport_write: reg 0x%04x = 0x%02x\n",
                reg, val);

        writel(ULPI_VIEW_WAKEUP | ULPI_VIEW_WRITE, view);
        ret = ulpi_viewport_wait(view, ULPI_VIEW_WAKEUP);
        if (ret)
                return ret;

        writel(ULPI_VIEW_RUN | ULPI_VIEW_WRITE | ULPI_VIEW_DATA_WRITE(val) |
               ULPI_VIEW_ADDR(reg), view);

        return ulpi_viewport_wait(view, ULPI_VIEW_RUN);
}
And that’s it. One can also cover the ulpi_viewport_read() method in the same way, but it wasn’t important to me (I wanted to see the powering on of Vbus).
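To spot Vbus being powered on, the value written to the ULPI OTG Control register (address 0x0a) needs to be broken into bits. A minimal Python sketch for decoding the lines that this pr_info() produces follows. The register addresses and bit names are given as I recall them from the ULPI specification (they match the kernel’s include/linux/usb/ulpi.h as far as I can tell), so verify them against the PHY’s datasheet. The function name is made up, and the sample line is one of those in the kernel log shown further down in this post.

```python
import re

# Register addresses and OTG Control bit names, per the ULPI specification
ULPI_REGISTERS = {
    0x04: "Function Control",
    0x07: "Interface Control",
    0x0A: "OTG Control",
    0x16: "Scratch",
}
OTG_CONTROL_BITS = [  # bit 0 first
    "IdPullup", "DpPulldown", "DmPulldown", "DischrgVbus",
    "ChrgVbus", "DrvVbus", "DrvVbusExternal", "UseExternalVbusIndicator",
]

LINE_RE = re.compile(r"ulpi_viewport_write: reg 0x([0-9a-f]+) = 0x([0-9a-f]+)")


def decode_line(line):
    """Return (register name, value, set OTG Control bits), or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None  # not one of the added log lines
    reg, val = (int(g, 16) for g in match.groups())
    name = ULPI_REGISTERS.get(reg, "register 0x%02x" % reg)
    flags = []
    if reg == 0x0A:  # only the OTG Control register is broken into bits
        flags = [bit for i, bit in enumerate(OTG_CONTROL_BITS) if val & (1 << i)]
    return name, val, flags


print(decode_line("[ 1.767881] ulpi_viewport_write: reg 0x000a = 0x67"))
```

Under these assumptions, the write of 0x67 sets DrvVbus and DrvVbusExternal on top of the pull-up and pull-down bits, which is the powering on of Vbus that I was looking for.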
The relevant part in my device tree read:
usb_phy0: phy0 {
        compatible = "ulpi-phy";
        #phy-cells = <0>;
        reg = <0xe0002000 0x1000>;
        view-port = <0x0170>;
        drv-vbus;
};

usb0: usb@e0002000 {
        compatible = "xlnx,zynq-usb-2.20a", "chipidea,usb2";
        clocks = <&clkc 28>;
        interrupt-parent = <&ps7_scugic_0>;
        interrupts = <0 21 4>;
        reg = <0xe0002000 0x1000>;
        phy_type = "ulpi";
        dr_mode = "host";
        usb-phy = <&usb_phy0>;
};
And this is what I got in the dmesg log:
[ 1.396317] ulpi_phy_probe() invoked
[ 1.399968] ulpi_phy_probe() returns successfully
[ 1.405148] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 1.418505] ehci-pci: EHCI PCI platform driver
[ 1.429765] ehci-platform: EHCI generic platform driver
[ 1.441924] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[ 1.454951] ohci-pci: OHCI PCI platform driver
[ 1.466237] ohci-platform: OHCI generic platform driver
[ 1.478504] uhci_hcd: USB Universal Host Controller Interface driver
[ 1.492047] usbcore: registered new interface driver usb-storage
[ 1.505250] chipidea-usb2 e0002000.usb: ci_hdrc_usb2_probe invoked
[ 1.518410] e0002000.usb supply vbus not found, using dummy regulator
[ 1.532049] ci_hdrc ci_hdrc.0: ChipIdea HDRC found, revision: 22, lpm: 0; cap: e0d0a100 op: e0d0a140
[ 1.532062] ulpi_init() invoked
[ 1.542033] ULPI transceiver vendor/product ID 0x0424/0x0007
[ 1.554611] Found SMSC USB3320 ULPI transceiver.
[ 1.566118] ulpi_viewport_write: reg 0x0016 = 0x55
[ 1.577825] ulpi_viewport_write: reg 0x0016 = 0xaa
[ 1.589383] ULPI integrity check: passed.
[ 1.600057] ulpi_viewport_write: reg 0x000a = 0x06
[ 1.611516] ulpi_viewport_write: reg 0x0007 = 0x00
[ 1.622872] ulpi_viewport_write: reg 0x0004 = 0x41
[ 1.634203] ci_hdrc ci_hdrc.0: It is OTG capable controller
[ 1.634233] ci_hdrc ci_hdrc.0: EHCI Host Controller
[ 1.645628] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 1
[ 1.672482] ci_hdrc ci_hdrc.0: USB 2.0 started, EHCI 1.00
[ 1.684475] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002
[ 1.697770] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 1.711521] usb usb1: Product: EHCI Host Controller
[ 1.722832] usb usb1: Manufacturer: Linux 4.4.30-xillinux-2.0 ehci_hcd
[ 1.735797] usb usb1: SerialNumber: ci_hdrc.0
[ 1.747236] hub 1-0:1.0: USB hub found
[ 1.757421] hub 1-0:1.0: 1 port detected
[ 1.767881] ulpi_viewport_write: reg 0x000a = 0x67
The “invoked” / “returns successfully” log entries above (marked in green in the original post) are just some other similar debug outputs I made, and they pretty much explain themselves.
Did you note that the ULPI was detected by vendor ID / product ID? It’s for real. These were obtained by reading ULPI registers (not shown above). I’m not all that convinced that this detection made any difference, except for printing out the name of the device.
As for the meaning of these ulpi_viewport_write dumps, most of them are pretty boring: The first two writes to address 0x16 do nothing. It’s a scratch pad register, most likely used by the driver to test the ULPI interface.
The following three writes just assign the default values. So this does effectively nothing as well.
The last write to register 0x0a (the OTG Control register) sets bits 6, 5 and 0, which are DrvVbusExternal, DrvVbus and IdPullup (on top of bits 2 and 1, which were already set by the default value, 0x06). The interesting part to me was DrvVbusExternal and DrvVbus, because setting either of these two (or both) causes the chip’s CPEN pin to go high, which turns on the power supply for Vbus. This is the point where the USB port starts behaving like a host and feeds power.
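The register values in the log can be decoded with a few lines of Python. This is only a sketch: the names of bits 6, 5 and 0 are those mentioned above, while the names of the remaining bits are given as I recall them from the ULPI specification, so treat them as an assumption and check them against the PHY’s datasheet.

```python
# Bit names of the ULPI OTG Control register (address 0x0a).
# Bits 6, 5 and 0 are named in the text; the other names are an
# assumption (from memory of the ULPI spec), to be verified.
OTG_CTRL_BITS = {
    0: "IdPullup",
    1: "DpPulldown",
    2: "DmPulldown",
    3: "DischrgVbus",
    4: "ChrgVbus",
    5: "DrvVbus",
    6: "DrvVbusExternal",
    7: "UseExternalVbusIndicator",
}


def decode_otg_ctrl(val):
    """Return the names of the bits that are set in val, lowest bit first."""
    return [name for bit, name in sorted(OTG_CTRL_BITS.items())
            if val & (1 << bit)]


def newly_set(before, after):
    """Return the names of the bits set in 'after' but not in 'before'."""
    return decode_otg_ctrl(after & ~before)


# The two writes to register 0x0a in the dmesg log above
print(decode_otg_ctrl(0x06))
print(newly_set(0x06, 0x67))
```

The second call lists only what the last write changed relative to the default value, which is the part that turns Vbus on.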
Introduction
Whenever a PLL is used in a design to generate one clock from another, it’s quite common to expect the timing tools to figure out the frequencies and timing relations between the different clocks.
With Intel’s Quartus tools, this isn’t the case by default. A derive_pll_clocks command is required in the SDC constraints file for this to happen. And indeed, this command appears in virtually any SDC file that is generated automatically by the tools.
But here’s the scary thing: If derive_pll_clocks is omitted, one would expect that the PLL’s output clocks would not be timed at all, and that the relevant paths would be listed as unconstrained. Unfortunately, it’s different: As shown below, timing calculations are made for these paths, but with wrong figures. So one might get the impression that the timing constraints were met and all is fine, but in fact nothing is assured.
An example
Let’s say that the FPGA has an oscillator input of 48 MHz (hence a period of 20.833 ns), from which a PLL generates a 240 MHz clock (with a period of 4.166 ns).
First let’s take a simple, properly written, SDC file going:
create_clock -name root_clk -period 20.833 [get_ports {osc_clock}]
derive_pll_clocks
derive_clock_uncertainty
Note that the derive_pll_clocks command is there.
Now let’s look at the timing report for a path between two registers, which are clocked by the derived clock. The only interesting part is the “latch edge time” line (marked in red in the original post):
+-------------------------------------------------------------------------------------------------------------------------------------------------+
; Data Arrival Path ;
+---------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
; Total ; Incr ; RF ; Type ; Fanout ; Location ; Element ;
+---------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
; 0.000 ; 0.000 ; ; ; ; ; launch edge time ;
; 4.937 ; 4.937 ; ; ; ; ; clock path ;
; 0.000 ; 0.000 ; ; ; ; ; source latency ;
; 0.000 ; 0.000 ; ; ; 1 ; PIN_B12 ; osc_clock ;
; 0.000 ; 0.000 ; RR ; IC ; 1 ; IOIBUF_X19_Y29_N8 ; osc_clock~input|i ;
; 0.667 ; 0.667 ; RR ; CELL ; 2 ; IOIBUF_X19_Y29_N8 ; osc_clock~input|o ;
; 2.833 ; 2.166 ; RR ; IC ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|inclk[0] ;
; 1.119 ; -1.714 ; RR ; COMP ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|observablevcoout ;
; 1.119 ; 0.000 ; RR ; CELL ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|clk[0] ;
; 3.274 ; 2.155 ; RR ; IC ; 1 ; CLKCTRL_G13 ; clkrst_ins|altpll_component|auto_generated|wire_pll1_clk[0]~clkctrl|inclk[0] ;
; 3.274 ; 0.000 ; RR ; CELL ; 8 ; CLKCTRL_G13 ; clkrst_ins|altpll_component|auto_generated|wire_pll1_clk[0]~clkctrl|outclk ;
; 4.336 ; 1.062 ; RR ; IC ; 1 ; FF_X40_Y24_N27 ; clkrst_ins|main_state[0]|clk ;
; 4.937 ; 0.601 ; RR ; CELL ; 1 ; FF_X40_Y24_N27 ; clkrst:clkrst_ins|main_state[0] ;
; 6.774 ; 1.837 ; ; ; ; ; data path ;
; 5.169 ; 0.232 ; ; uTco ; 1 ; FF_X40_Y24_N27 ; clkrst:clkrst_ins|main_state[0] ;
; 5.169 ; 0.000 ; FF ; CELL ; 5 ; FF_X40_Y24_N27 ; clkrst_ins|main_state[0]|q ;
; 5.591 ; 0.422 ; FF ; IC ; 1 ; LCCOMB_X40_Y24_N24 ; clkrst_ins|Equal1~0|dataa ;
; 6.002 ; 0.411 ; FR ; CELL ; 1 ; LCCOMB_X40_Y24_N24 ; clkrst_ins|Equal1~0|combout ;
; 6.370 ; 0.368 ; RR ; IC ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst_ins|the_register|d ;
; 6.774 ; 0.404 ; RR ; CELL ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst:clkrst_ins|the_register ;
+---------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
+-------------------------------------------------------------------------------------------------------------------------------------------------+
; Data Required Path ;
+---------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
; Total ; Incr ; RF ; Type ; Fanout ; Location ; Element ;
+---------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
; 4.166 ; 4.166 ; ; ; ; ; latch edge time ;
; 9.005 ; 4.839 ; ; ; ; ; clock path ;
; 4.166 ; 0.000 ; ; ; ; ; source latency ;
; 4.166 ; 0.000 ; ; ; 1 ; PIN_B12 ; osc_clock ;
; 4.166 ; 0.000 ; RR ; IC ; 1 ; IOIBUF_X19_Y29_N8 ; osc_clock~input|i ;
; 4.833 ; 0.667 ; RR ; CELL ; 2 ; IOIBUF_X19_Y29_N8 ; osc_clock~input|o ;
; 6.912 ; 2.079 ; RR ; IC ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|inclk[0] ;
; 5.119 ; -1.793 ; RR ; COMP ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|observablevcoout ;
; 5.119 ; 0.000 ; RR ; CELL ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|clk[0] ;
; 7.187 ; 2.068 ; RR ; IC ; 1 ; CLKCTRL_G13 ; clkrst_ins|altpll_component|auto_generated|wire_pll1_clk[0]~clkctrl|inclk[0] ;
; 7.187 ; 0.000 ; RR ; CELL ; 8 ; CLKCTRL_G13 ; clkrst_ins|altpll_component|auto_generated|wire_pll1_clk[0]~clkctrl|outclk ;
; 8.199 ; 1.012 ; RR ; IC ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst_ins|the_register|clk ;
; 8.736 ; 0.537 ; RR ; CELL ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst:clkrst_ins|the_register ;
; 9.005 ; 0.269 ; ; ; ; ; clock pessimism removed ;
; 8.985 ; -0.020 ; ; ; ; ; clock uncertainty ;
; 8.890 ; -0.095 ; ; uTsu ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst:clkrst_ins|the_register ;
+---------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
Aside from all the mumbo-jumbo, there’s the “latch edge time” line, which is the time of the edge of the clock that will propagate through the clock network and become the latching clock on the receiving register. As this is the case of a plain register-to-register path, both clocked with the rising edge of the same clock, the “latch edge time” is simply the clock’s period. Indeed 4.166 ns. So far so good.
But then disaster
Let’s see what happens if the derive_pll_clocks command is omitted. In other words, the SDC file reads:
create_clock -name root_clk -period 20.833 [get_ports {osc_clock}]
derive_clock_uncertainty
For exactly the same path, the timing report reads:
+-------------------------------------------------------------------------------------------------------------------------------------------------+
; Data Arrival Path ;
+---------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
; Total ; Incr ; RF ; Type ; Fanout ; Location ; Element ;
+---------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
; 0.000 ; 0.000 ; ; ; ; ; launch edge time ;
; 4.937 ; 4.937 ; ; ; ; ; clock path ;
; 0.000 ; 0.000 ; ; ; ; ; source latency ;
; 0.000 ; 0.000 ; ; ; 1 ; PIN_B12 ; osc_clock ;
; 0.000 ; 0.000 ; RR ; IC ; 1 ; IOIBUF_X19_Y29_N8 ; osc_clock~input|i ;
; 0.667 ; 0.667 ; RR ; CELL ; 2 ; IOIBUF_X19_Y29_N8 ; osc_clock~input|o ;
; 2.833 ; 2.166 ; RR ; IC ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|inclk[0] ;
; 1.119 ; -1.714 ; RR ; COMP ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|observablevcoout ;
; 1.119 ; 0.000 ; RR ; CELL ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|clk[0] ;
; 3.274 ; 2.155 ; RR ; IC ; 1 ; CLKCTRL_G13 ; clkrst_ins|altpll_component|auto_generated|wire_pll1_clk[0]~clkctrl|inclk[0] ;
; 3.274 ; 0.000 ; RR ; CELL ; 8 ; CLKCTRL_G13 ; clkrst_ins|altpll_component|auto_generated|wire_pll1_clk[0]~clkctrl|outclk ;
; 4.336 ; 1.062 ; RR ; IC ; 1 ; FF_X40_Y24_N19 ; clkrst_ins|main_state[0]|clk ;
; 4.937 ; 0.601 ; RR ; CELL ; 1 ; FF_X40_Y24_N19 ; clkrst:clkrst_ins|main_state[0] ;
; 6.765 ; 1.828 ; ; ; ; ; data path ;
; 5.169 ; 0.232 ; ; uTco ; 1 ; FF_X40_Y24_N19 ; clkrst:clkrst_ins|main_state[0] ;
; 5.169 ; 0.000 ; FF ; CELL ; 5 ; FF_X40_Y24_N19 ; clkrst_ins|main_state[0]|q ;
; 5.583 ; 0.414 ; FF ; IC ; 1 ; LCCOMB_X40_Y24_N24 ; clkrst_ins|Equal1~0|datab ;
; 5.994 ; 0.411 ; FR ; CELL ; 1 ; LCCOMB_X40_Y24_N24 ; clkrst_ins|Equal1~0|combout ;
; 6.361 ; 0.367 ; RR ; IC ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst_ins|the_register|d ;
; 6.765 ; 0.404 ; RR ; CELL ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst:clkrst_ins|the_register ;
+---------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
+--------------------------------------------------------------------------------------------------------------------------------------------------+
; Data Required Path ;
+----------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
; Total ; Incr ; RF ; Type ; Fanout ; Location ; Element ;
+----------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
; 20.833 ; 20.833 ; ; ; ; ; latch edge time ;
; 25.672 ; 4.839 ; ; ; ; ; clock path ;
; 20.833 ; 0.000 ; ; ; ; ; source latency ;
; 20.833 ; 0.000 ; ; ; 1 ; PIN_B12 ; osc_clock ;
; 20.833 ; 0.000 ; RR ; IC ; 1 ; IOIBUF_X19_Y29_N8 ; osc_clock~input|i ;
; 21.500 ; 0.667 ; RR ; CELL ; 2 ; IOIBUF_X19_Y29_N8 ; osc_clock~input|o ;
; 23.579 ; 2.079 ; RR ; IC ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|inclk[0] ;
; 21.786 ; -1.793 ; RR ; COMP ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|observablevcoout ;
; 21.786 ; 0.000 ; RR ; CELL ; 1 ; PLL_3 ; clkrst_ins|altpll_component|auto_generated|pll1|clk[0] ;
; 23.855 ; 2.069 ; RR ; IC ; 1 ; CLKCTRL_G13 ; clkrst_ins|altpll_component|auto_generated|wire_pll1_clk[0]~clkctrl|inclk[0] ;
; 23.855 ; 0.000 ; RR ; CELL ; 8 ; CLKCTRL_G13 ; clkrst_ins|altpll_component|auto_generated|wire_pll1_clk[0]~clkctrl|outclk ;
; 24.867 ; 1.012 ; RR ; IC ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst_ins|the_register|clk ;
; 25.404 ; 0.537 ; RR ; CELL ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst:clkrst_ins|the_register ;
; 25.672 ; 0.268 ; ; ; ; ; clock pessimism removed ;
; 25.572 ; -0.100 ; ; ; ; ; clock uncertainty ;
; 25.477 ; -0.095 ; ; uTsu ; 1 ; DDIOOUTCELL_X41_Y24_N4 ; clkrst:clkrst_ins|the_register ;
+----------+----------+----+------+--------+------------------------+------------------------------------------------------------------------------+
So it’s exactly the same analysis, only assuming that the clock period is 20.833 ns (note the “latch edge time” again). Note that the analysis traverses the PLL, but simply ignores the fact that the PLL’s output has another frequency. It’s as if the tools were saying: You forgot the derive_pll_clocks constraint? No problem. We’ll play as if the PLL’s input clock went right through it.
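To put numbers on this, here is a small Python sketch that recomputes the clock periods and the setup slack from the figures in the two reports. The function names are mine, and the slack is taken simply as the last “Data Required” total minus the last “Data Arrival” total, which is my reading of these reports.

```python
def period_ns(freq_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def setup_slack(data_required_ns, data_arrival_ns):
    """Setup slack: positive means the constraint is reported as met."""
    return round(data_required_ns - data_arrival_ns, 3)


# Periods of the 48 MHz oscillator and the 240 MHz PLL output
osc_period = period_ns(48)
pll_period = period_ns(240)

# Final totals of the two reports above
slack_with_derive = setup_slack(8.890, 6.774)      # latch edge at 4.166 ns
slack_without_derive = setup_slack(25.477, 6.765)  # latch edge at 20.833 ns

print("periods: %.3f ns and %.3f ns" % (osc_period, pll_period))
print("slack with derive_pll_clocks:    %.3f ns" % slack_with_derive)
print("slack without derive_pll_clocks: %.3f ns" % slack_without_derive)
```

The slack reported without derive_pll_clocks is larger by roughly the difference between the two periods. A path that misses the real 4.166 ns requirement by a few nanoseconds would still be reported as passing comfortably.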
Frankly, I can’t think of a single case where this behavior would make sense. Either don’t calculate the timing of paths of the derived clock, or do it correctly. But just throwing in the original clock’s period? These incorrectly constrained paths don’t appear in the unconstrained path summary, nor is there any other indication that the timing is horribly wrong.
In the tools’ defense, the timing analysis produces warnings on this matter, but none at a Critical level, so it’s easy to miss them in the sea of warnings that FPGA tools always generate.
Bottom line
- Make sure your design has the derive_pll_clocks command if you have a PLL involved (unless you’ve added explicit constraints for the derived clocks).
- Be sure to generate a timing report, and to read and understand it.
- Always test your constraints by requiring impossible values, and verify that the failing paths are calculated correctly.
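The first item on this list can be turned into a crude automatic check. The sketch below is a hypothetical helper (not part of Quartus) that scans the text of an SDC file for either derive_pll_clocks or an explicit create_generated_clock. It treats everything after a “#” as a comment, which is a simplification of Tcl’s rules, and it says nothing about whether the constraints are correct.

```python
import re

# Commands that define clocks at a PLL's outputs
PLL_CLOCK_COMMANDS = re.compile(
    r"\b(derive_pll_clocks|create_generated_clock)\b")


def has_pll_constraints(sdc_text):
    """Return True if any non-comment line constrains derived clocks."""
    for line in sdc_text.splitlines():
        code = line.split("#", 1)[0]  # Simplified comment stripping
        if PLL_CLOCK_COMMANDS.search(code):
            return True
    return False


# The two SDC files from the example above
GOOD_SDC = """create_clock -name root_clk -period 20.833 [get_ports {osc_clock}]
derive_pll_clocks
derive_clock_uncertainty
"""

BAD_SDC = """create_clock -name root_clk -period 20.833 [get_ports {osc_clock}]
derive_clock_uncertainty
"""

print(has_pll_constraints(GOOD_SDC), has_pll_constraints(BAD_SDC))
```

A check like this only catches the omission. Reading the timing report remains the real test.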
Why is it at 2.5 GT/s???
With all said about Nvidia’s refusal to release their drivers as open source, their Linux support is great. I don’t think I’ve ever had such a flawless graphics card experience with Linux. After replacing the nouveau driver with Nvidia’s, of course. Ideology is nice, but a computer that works is nicer.
But then I looked at the output of lspci -vv (on an Asus fanless GT 730 2GB DDR3), and horrors, it’s not running at full PCIe speed!
17:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 730] (rev a1) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. GK208B [GeForce GT 730]
[ ... ]
Capabilities: [78] Express (v2) Legacy Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
[ ... ]
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
Whatwhat? The card declares it supports 5 GT/s, but runs only at 2.5 GT/s? And on my brand new super-duper motherboard, which supports Gen3 PCIe connected directly to an Intel X-family CPU?
It’s all under control
Well, the answer is surprisingly simple: Nvidia’s driver changes the card’s PCIe speed dynamically to support the bandwidth needed. When there’s no graphics activity, the speed drops to 2.5 GT/s.
This behavior can be controlled with Nvidia’s X Server Settings control panel (it has an icon in the system’s settings panel, or just type “Nvidia” on Gnome’s start menu). Under the PowerMizer sub-menu, the card’s behavior can be changed to stay at 5 GT/s if you like your card hot and electricity bill fat.
Otherwise, in “Adaptive mode” it switches back and forth from 2.5 GT/s to 5 GT/s. The screenshot below was taken after a few seconds of idling:
[Screenshot: PowerMizer panel in Adaptive mode]
And this is how to force it to 5 GT/s constantly:
[Screenshot: PowerMizer panel set to stay at 5 GT/s]
With the latter setting, lspci -vv shows that the card is at 5 GT/s, as promised:
17:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 730] (rev a1) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. GK208B [GeForce GT 730]
[ ... ]
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
So don’t worry about a low speed on an Nvidia card (or make sure it steps up on request).
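For checking this repeatedly, the comparison between LnkCap and LnkSta can be scripted. The following is a minimal Python sketch that picks the two speeds out of the text of lspci -vv for a single device. It assumes the “Speed …GT/s” format shown above, and the sample text is copied from the first listing.

```python
import re

SAMPLE = """LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
"""


def link_speeds(lspci_text):
    """Return (maximal, current) link speed in GT/s, or None if not found."""
    cap = re.search(r"LnkCap:.*?Speed ([\d.]+)GT/s", lspci_text)
    sta = re.search(r"LnkSta:.*?Speed ([\d.]+)GT/s", lspci_text)
    if cap is None or sta is None:
        return None
    return float(cap.group(1)), float(sta.group(1))


max_speed, cur_speed = link_speeds(SAMPLE)
if cur_speed < max_speed:
    print("Link at %g GT/s, capable of %g GT/s" % (cur_speed, max_speed))
```

With an Nvidia card, a lower current speed at idle is expected, so such a check makes sense only while the card is under load.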
A word on GT 1030
I added another fanless card, Asus GT 1030 2GB, to the computer for some experiments. This card is somewhat harder to catch at 2.5 GT/s, because it steps up very quickly in response to any graphics event. But I managed to catch this:
65:00.0 VGA compatible controller: NVIDIA Corporation GP108 (rev a1) (prog-if 00 [VGA controller])
Subsystem: ASUSTeK Computer Inc. GP108 [GeForce GT 1030]
[ ... ]
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
The running 2.5 GT/s speed vs. the maximal 8 GT/s is pretty clear by now, but the declared maximal Width is 4x? If so, why does it have a 16x PCIe form factor? The GT 730 has an 8x form factor, and uses 8x lanes, but GT 1030 has 16x and declares it can only use 4x? Is this some kind of marketing thing to make the card look larger and stronger?
On the other hand, show me a fairly recent motherboard without a 16x PCIe slot. The thing is that sometimes that slot can be used for something else, and the graphics card could then have gone into a vacant 4x slot instead. But no. Let’s make it big and impressive with a long PCIe plug that makes it look massive. Personally, I find the gigantic heatsink impressive enough.
TL;DR
DMA writes from a Cyclone 10 GX PCIe interface may be lost, probably due to a path that isn’t timed properly by the fitter. This has been observed with Quartus Prime Version 17.1.0 Build 240 SJ Pro Edition, and the official Cyclone 10 GX development board. A wider impact is likely, possibly on Arria 10 devices as well (as their PCIe block is the same one).
The problem seems to be rare, and appears and disappears depending on how the fitter places the logic. It’s however fairly easy to diagnose if this specific problem is in effect (see “The smoking gun” below).
Computer hardware: Gigabyte GA-B150M-D2V motherboard (with an Intel B150 Chipset) + Intel i5-6400 CPU.
The story
It started with a routine data transport test (FPGA to host), which failed virtually immediately (that is, after a few kilobytes). It was apparent that some portions of data simply weren’t written into the DMA buffer by the FPGA.
So I tried a fix in my own code, and yep, it helped. Or so I thought. Actually, anything I changed seemed to fix the problem. In the end, I changed nothing, but just added
set_global_assignment -name SEED 2
to the QSF file. Which only changes the fitter’s initial placement of the logic elements, which eventually leads to an alternative placement and routing of the design. That should work exactly the same, of course. But it “solved the problem”.
This was consistent: One “magic” build that failed consistently, and any change whatsoever made the issue disappear.
The design was properly constrained, of course, as shown in the development board’s sample SDC file. In fact, there isn’t much to constrain: It’s just setting the main clock to 100 MHz, derive_pll_clocks and derive_clock_uncertainty. And a false path from the PERST pin.
So maybe my bad? Well, no. There were no unconstrained paths in the entire design (with these simple constraints), so one fitting of the design should be exactly like any other. Maybe my application logic? No again:
The smoking gun
The final nail in the coffin was when I noted errors in the PCIe Device Status Registers on both sides. I’ve discussed this topic in two other posts of mine, however in the current case no AER kernel messages were produced (unfortunately, and it’s not clear why).
And whatever the application code does, Intel / Altera’s PCIe block shouldn’t produce a link error, and normally it doesn’t. It’s a violation of the PCIe spec.
These are the steps for observing this issue on a Linux machine. First, find out who the link partners are:
$ lspci
00:00.0 Host bridge: Intel Corporation Device 191f (rev 07)
00:01.0 PCI bridge: Intel Corporation Device 1901 (rev 07)
[ ... ]
01:00.0 Unassigned class [ff00]: Altera Corporation Device ebeb
and then figuring out that the FPGA card is connected via the bridge at 00:01.0 with
$ lspci -t
-[0000:00]-+-00.0
+-01.0-[01]----00.0
So it’s between 00:01.0 and 01:00.0. Then, following that post of mine, I used setpci to read the Device Status register, in order to tell whether an error had occurred.
First, what it should look like: With any bitstream except that specific faulty one, I got
# setpci -s 01:00.0 CAP_EXP+0xa.w
0000
# setpci -s 00:01.0 CAP_EXP+0xa.w
0000
any time and all the time, which says the obvious: No errors sensed on either side.
But with the bitstream that had data losses, before any communication had taken place (except for the driver being loaded):
# setpci -s 01:00.0 CAP_EXP+0xa.w
0009
# setpci -s 00:01.0 CAP_EXP+0xa.w
0000
Non-zero means error. So at this stage the FPGA’s PCIe interface was unhappy with something (more on that below), but the processor’s side had no complaints.
I have to admit that I’ve seen the 0009 status in a lot of other tests, in which communication went through perfectly. So even though it reflects some kind of error, it doesn’t necessarily predict any functional fault. As elaborated below, the 0009 status consists of correctable errors. It’s just that such errors are normally never seen (i.e. with any PCIe card that works properly).
Anyhow, back to the bitstream that did have data errors. After some data had been written by the FPGA:
# setpci -s 01:00.0 CAP_EXP+0xa.w
0009
# setpci -s 00:01.0 CAP_EXP+0xa.w
000a
In this case, the FPGA card’s link partner complained. To save ourselves looking up the meaning of these numbers (even though they’re listed in that post), use lspci -vv:
# lspci -vv
00:01.0 PCI bridge: Intel Corporation Device 1901 (rev 07) (prog-if 00 [Normal decode])
[ ... ]
Capabilities: [a0] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 256 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
[ ... ]
So the bridge complained about an uncorrectable error and an unsupported request only after the data transmission, but the FPGA side:
01:00.0 Unassigned class [ff00]: Altera Corporation Device ebeb
[ ... ]
Capabilities: [80] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
complained about a correctable error and an unsupported request (as seen above, that happened before any payload transmission).
Low-level errors. I couldn’t make this happen even if I wanted to.
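For reference, the values read with setpci can be decoded without lspci. The Python sketch below uses the bit layout of the PCIe Device Status register as I know it from the PCIe specification (the names in the dictionary are the spec’s, not lspci’s). Note that the lspci output above labels bit 1 as “UncorrErr”.

```python
# Low bits of the PCIe Device Status register (CAP_EXP+0xa)
DEVSTA_BITS = {
    0: "Correctable Error Detected",
    1: "Non-Fatal Error Detected",
    2: "Fatal Error Detected",
    3: "Unsupported Request Detected",
    4: "AUX Power Detected",
    5: "Transactions Pending",
}


def decode_devsta(val):
    """Return the names of the bits that are set, lowest bit first."""
    return [name for bit, name in sorted(DEVSTA_BITS.items())
            if val & (1 << bit)]


# The two non-zero values read with setpci above
for status in (0x0009, 0x000a):
    print("%04x: %s" % (status, ", ".join(decode_devsta(status))))
```

This agrees with the DevSta lines above: 0009 on the FPGA side is a correctable error plus an unsupported request, and 000a on the bridge is a non-fatal (uncorrectable) error plus an unsupported request.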
Aftermath
The really bad news is that this problem isn’t in the logic itself, but in how it’s placed. It seems to be a rare and random occurrence of a poor job done by the fitter. Or maybe it’s not all that rare, if you let the FPGA heat up a bit. In my case a spinning fan kept an almost idle FPGA quite cool, I suppose.
The somewhat good news is that the data loss comes with these PCIe status errors, and maybe with the relevant kernel messages (not clear why I didn’t see any). So there’s something to hold on to.
And I should also mention that the offending PCIe interface was a Gen2 x 4 running with a 64-bit interface at 250 MHz, which is a rather marginal frequency for Arria 10 / Cyclone 10. So going with the speculation that this is a timing issue that isn’t handled properly by the fitter, maybe sticking to 125 MHz interfaces on these devices is good enough to be safe against this issue.
Note to self: The outputs are kept in cyclone10-failure.tar.gz
Command-line?
Yes, it’s much more convenient than the GUI programmer. Programming an FPGA is a repeated task, always the same file to the same FPGA on the same board connected to the computer. And somehow the GUI programming tools turn it into a daunting ceremony (and sometimes even a quiz, when it can’t tell exactly which device is connected, so I’m supposed to nail the exact one).
With the command line, it’s literally picking the command from bash history and pressing Enter. And surprisingly enough, the command line tool doesn’t ask the silly questions that the GUI tool does.
First, some mucking about
Set up the environment:
$ /path/to/quartus/15.1/nios2eds/nios2_command_shell.sh
To list all devices found (cable auto-detected):
$ quartus_pgm --auto
Info: *******************************************************************
Info: Running Quartus Prime Programmer
Info: Version 15.1.0 Build 185 10/21/2015 SJ Lite Edition
Info: Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
[ ... ]
Info: agreement for further details.
Info: Processing started: Sun May 27 15:06:22 2018
Info: Command: quartus_pgm --auto
Info (213045): Using programming cable "USB-BlasterII [2-5.1]"
1) USB-BlasterII [2-5.1]
02B040DD 5CGTFD9(A5|C5|D5|E5)/..
020A40DD 5M2210Z/EPM2210
[ ... ]
Note that listing the devices as shown above is not necessary for programming. It might be useful to tell the position of the FPGA in the JTAG chain, maybe. Really something that is done once to explore the board.
jtagd
It’s important to be aware of this daemon, which listens to TCP/IP port 1309: It’s responsible for talking with the JTAG adapter through the USB bus, so both the GUI and command line programmer utilities rely on it. If there’s no daemon running, both of these launch it.
But if you use multiple versions of Quartus, this may be a source of confusion, in particular if you make a first attempt to program an FPGA with an older version, and then try a newer one. That’s because the newer version of Quartus will keep using the older version of jtagd, possibly failing to work with recent devices. Bottom line: If wonky things happen, this won’t hurt:
$ killall jtagd
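To tell whether a daemon is currently up, it’s possible to check if anything accepts connections on that port. This is a minimal Python sketch, assuming that jtagd listens on the local machine at port 1309 as said above. It only shows that something is listening, not which Quartus version launched it.

```python
import socket

JTAGD_PORT = 1309  # The port mentioned above


def port_is_open(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # Connection refused, timeout, unreachable etc.
        return False


if port_is_open(JTAGD_PORT):
    print("Something (probably jtagd) is listening on port %d" % JTAGD_PORT)
else:
    print("Nothing is listening on port %d" % JTAGD_PORT)
```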
Programming
quartus_pgm displays most of its output in green. Generally speaking, if there’s no red text, all went fine.
$ quartus_pgm -m jtag -o "p;path/to/file.sof"
Or add the position in the JTAG chain explicitly (in particular if it’s not the first device). In this case it’s @1, meaning it’s the first device in the JTAG chain. If it’s the second device, pick @2 etc.
$ quartus_pgm -m jtag -o "p;path/to/file.sof@1"
Info: *******************************************************************
Info: Running Quartus Prime Programmer
Info: Version 15.1.0 Build 185 10/21/2015 SJ Lite Edition
Info: Copyright (C) 1991-2015 Altera Corporation. All rights reserved.
Info: Your use of Altera Corporation's design tools, logic functions
Info: and other software and tools, and its AMPP partner logic
Info: functions, and any output files from any of the foregoing
Info: (including device programming or simulation files), and any
Info: associated documentation or information are expressly subject
Info: to the terms and conditions of the Altera Program License
Info: Subscription Agreement, the Altera Quartus Prime License Agreement,
Info: the Altera MegaCore Function License Agreement, or other
Info: applicable license agreement, including, without limitation,
Info: that your use is for the sole purpose of programming logic
Info: devices manufactured by Altera and sold by Altera or its
Info: authorized distributors. Please refer to the applicable
Info: agreement for further details.
Info: Processing started: Sun May 27 15:35:02 2018
Info: Command: quartus_pgm -m jtag -o p;path/to/file.sof@1
Info (213045): Using programming cable "USB-BlasterII [2-5.1]"
Info (213011): Using programming file p;path/to/file.sof@1 with checksum 0x061958E1 for device 5CGTFD9E5F35@1
Info (209060): Started Programmer operation at Sun May 27 15:35:05 2018
Info (209016): Configuring device index 1
Info (209017): Device 1 contains JTAG ID code 0x02B040DD
Info (209007): Configuration succeeded -- 1 device(s) configured
Info (209011): Successfully performed operation(s)
Info (209061): Ended Programmer operation at Sun May 27 15:35:09 2018
Info: Quartus Prime Programmer was successful. 0 errors, 0 warnings
Info: Peak virtual memory: 432 megabytes
Info: Processing ended: Sun May 27 15:35:09 2018
Info: Elapsed time: 00:00:07
Info: Total CPU time (on all processors): 00:00:03
If anything goes wrong — device mismatch, a failure to scan the JTAG chain or whatever, it will be hard to miss because of the errors written in red. The sweet thing with the command line interface is that every attempt starts from fresh, so just turn the board on (the usual reason for errors) and give it another go.
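If the programming is part of a larger script, the same command can be assembled and launched from Python. This is a sketch with hypothetical helper names, using only the quartus_pgm flags shown above. It assumes that quartus_pgm is in the execution path (i.e. the environment has been set up) and that it exits with a nonzero status on failure, which I haven’t verified.

```python
import subprocess


def quartus_pgm_args(sof_path, position=None):
    """Build the argument list for programming a .sof file through JTAG.

    position is the 1-based index of the device in the JTAG chain,
    or None to leave out the @N suffix.
    """
    operation = "p;" + sof_path
    if position is not None:
        operation += "@%d" % position
    return ["quartus_pgm", "-m", "jtag", "-o", operation]


def program(sof_path, position=None):
    """Run quartus_pgm. Return True if it exits with a zero status
    (assumed here to mean success)."""
    result = subprocess.run(quartus_pgm_args(sof_path, position))
    return result.returncode == 0


print(" ".join(quartus_pgm_args("path/to/file.sof", 1)))
```

Passing the arguments as a list means that no shell is involved, so the semicolon in the operation string needs no quoting.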
Cyclone 10 GX FPGA development kit
This board caused me some extra trouble, so a few words about it. When connected, it appears as 09fb:6810, however after attempting to program the FPGA (note the “@2” at the end) with
$ quartus_pgm -m jtag -o "p;thecode.sof@2"
Error (213019): Can't scan JTAG chain. Error code 86.
it changes to 09fb:6010. So there’s clearly some reprogramming of firmware (the log shows a disconnection and reconnection with the new ID). The board is detected as GX0000406 by the Quartus GUI Programming Tool, but clicking “Auto Detect” yields “Unable to scan device chain. Hardware is not connected”.
OK, what about a scan?
$ quartus_pgm --auto
[ ... ]
Info (213045): Using programming cable "10CGX0000406 [1-5.1.2]"
1) 10CGX0000406 [1-5.1.2]
Unable to read device chain - Hardware not attached
The problem in my case was apparently that the jtagd running was launched by an older version of Quartus, which didn’t recognize Cyclone 10 devices. So follow the advice above, and kill it. After that, programming with the command above worked with Quartus Pro 17.1:
$ quartus_pgm --auto
[...]
Info (213045): Using programming cable "USB-BlasterII [1-5.1.2]"
1) USB-BlasterII [1-5.1.2]
031820DD 10M08SA(.|ES)/10M08SC
02E120DD 10CX220Y