EXTENDED VERSION — SHORT VERSION APPEARED IN THE 9TH ICCV, NICE, FRANCE, OCTOBER 2003.
Multiple-View Structure and Motion From Line Correspondences

Adrien Bartoli    Peter Sturm
INRIA Rhône-Alpes, 655, avenue de l'Europe, 38334 Saint Ismier cedex, France.
[email protected]

Abstract
We address the problem of camera motion and structure reconstruction from line correspondences across multiple views, from initialization to final bundle adjustment. One of the main difficulties when dealing with line features is their algebraic representation. First, we consider the triangulation problem. Based on Plücker coordinates to represent the lines, we propose a maximum likelihood algorithm, relying on linearizing the Plücker constraint and on a Plücker correction procedure to compute the closest Plücker coordinates to a given 6-vector. Second, we consider the bundle adjustment problem. Previous over-parameterizations of 3D lines induce gauge freedoms and/or internal consistency constraints. We propose the orthonormal representation, which allows handy non-linear optimization of 3D lines using the minimum 4 parameters, within an unconstrained non-linear optimizer. We compare our algorithms to existing ones on simulated and real data.
1. Introduction
The goal of this paper is to give methods for the reconstruction of line features from image correspondences over multiple views, from initialization to final bundle adjustment. Reconstruction of line features is an important topic since it is used in areas such as scene modeling, augmented reality and visual servoing. Bundle adjustment is the computation of an optimal visual reconstruction of camera motion and 3D scene structure, where optimal means maximum likelihood in terms of reprojected image error. We make no assumption about the calibration of the cameras. We assume that line correspondences over at least three views are available¹.

While the multiple-view geometry of lines is well understood, see e.g. [3, 5], there is still a need for practical structure and motion algorithms. The factorization algorithms [7, 9, 13] yield reliable results but require that all lines be visible in all views. We focus on the common three-stage approach, see e.g. [5, §17.5], consisting of (i) computing camera motion using inter-image matching tensors, (ii) triangulating the features and (iii) running bundle adjustment. There exist reliable algorithms for step (i). In particular, it can be solved by computing trifocal tensors for triplets of consecutive images, using e.g. the automatic computation algorithm described in [5, §15.6], and registering the triplets in a manner similar to [4]. Other integrated motion estimation systems are [10], based on Kalman filtering techniques, and [14], based on registering each view in turn.

In steps (ii) and (iii), one of the main difficulties when dealing with line features arises: the algebraic representation. Indeed, there is no minimal, complete and globally non-singular parameterization of the 4-dimensional set of 3D lines, see e.g. [5, §2.2]. Hence, lines are often over-parameterized, e.g. as the join of two points or as the meet of two planes (8 parameters), or by the 6 coefficients of their Plücker coordinates, which must satisfy the bilinear Plücker constraint. Another over-parameterization is two images of the line (6 parameters). The most appropriate representation depends upon the problem considered. For example, the algorithm in [5, §15.2] shows that the 'two image lines' representation is well adapted to the computation of the trifocal tensor, while the sequential algorithm of [10] is based on Plücker coordinates.

Concerning step (ii), many of the previous works assume calibrated cameras, e.g. [6, 11, 12, 15], and use specific Euclidean representations. The linear three-view algorithm of [15] and the algorithm of [12] utilize a 'closest point + direction' representation, while [11] uses the projections of the line on the x = 0 and the y = 0 planes, which has obvious singularities.

¹ Line correspondences over two views do not constrain the camera motion.
These algorithms yield suboptimal results in that none of them maximizes the individual likelihood of the reconstructed lines.

Bundle adjustment, step (iii), is a non-linear procedure involving camera and line parameters, which maximizes the likelihood of the reconstruction, corresponding to minimizing the reprojection error when the noise on measured features is assumed to have an identical and independent normal distribution. The previously mentioned over-parameterizations are not well adapted to standard non-linear optimizers. The two-point and two-plane over-parameterizations have 4 degrees of internal gauge freedom², which may induce numerical instabilities. The 'two image lines' parameterization has 2 degrees of internal gauge freedom, and implies that one may have to choose different images for different lines, since all lines may not be visible in all views. Also, one must check that the chosen images are not too close to each other. Finally, direct optimization of Plücker coordinates makes sense only if a constrained optimization technique is used to enforce the bilinear Plücker constraint. An appropriate representation would involve neither internal constraints nor gauge freedoms.

To summarize, there is a need for an efficient optimal triangulation algorithm, and for a representation of 3D lines well adapted to non-linear optimization. We address both of these problems through the following contributions. In §3, we propose triangulation methods based on using Plücker coordinates to represent the lines. A simple and optimal algorithm is obtained based on linearizing the bilinear Plücker constraint within an iteratively reweighted least squares approach. In §4, we propose a non-linear representation of 3D lines that we call the orthonormal representation. This representation allows efficient non-linear optimization since only the minimum 4 parameters are needed for optimization, within any unconstrained non-linear optimizer. With this representation, there is no internal gauge freedom or consistency constraint, and analytic differentiation of the error function is possible. Finally, §5 validates our algorithms and compares them to existing ones.
The next section gives some preliminaries and notations and formally states the problem.
² For the former one: the position of the points along the line, and the free scale factor of the homogeneous representation of these points.

2. Preliminaries and Notations

We make no formal distinction between coordinate vectors and physical entities. Everything is represented in homogeneous coordinates. Equality up to scale is denoted by ~, transposition and transposed inverse by ^T and ^{-T}. Vectors are typeset using bold fonts (L, l), matrices using sans-serif fonts (S, A, R) and scalars in italics. Bars represent the inhomogeneous leading parts of vectors or matrices, e.g. M^T ~ (\bar{M}^T | m). The L2 norm of vector v is denoted ||v||. The identity matrix is denoted I. SO(2) and SO(3) denote the 2D and 3D rotation groups.

The 2D orthogonal (Euclidean) distance between point q and line l, weighted by q_3, is:

    d_\perp^2(q, l) = (q^T l)^2 / (l_1^2 + l_2^2).    (1)

Plücker line coordinates. Given two 3D points M^T ~ (\bar{M}^T | m) and N^T ~ (\bar{N}^T | n), one can represent the line joining them by a homogeneous 'Plücker' 6-vector L^T ~ (a^T | b^T), where a = \bar{M} × \bar{N} and b = m\bar{N} - n\bar{M}, see e.g. [5, §2.2]. Other conventions for Plücker 6-vectors are also possible. Each comes with a bilinear constraint that the 6-vector must satisfy in order to represent valid line coordinates. For our definition, the constraint is:

    C(L) = 0  where  C(L) = a^T b.    (2)

Perspective projection matrix for Plücker line coordinates. Given a standard (3 × 4) perspective projection matrix P ~ (\bar{P} | p), a (3 × 6) matrix projecting Plücker line coordinates [2, 3] is given by:

    \tilde{P} ~ ( det(\bar{P}) \bar{P}^{-T} | [p]_× \bar{P} ).    (3)

It can be easily derived by expanding the expression of the 2D line joining the projections of two points.

Maximum likelihood estimation. As noted in [5, §15.7.2], no matter how many points are used to represent an image line l, the quadratic error function on it can be expressed in the form d_\perp^2(x, l) + d_\perp^2(y, l) for two weighted points x, y on l. We will use this representation for simplicity. If we have 3D lines S = {L^1, ..., L^m} and cameras M = {P^1, ..., P^n}, the negative log likelihood function E(S, M) for the reconstruction, corresponding to the reprojection error, can be written in terms of individual reprojection errors E(L^j, M) for each line j:

    E(S, M) = \sum_{j=1}^{m} E(L^j, M)    (4)

    E(L^j, M) = \sum_{i=1}^{n} ( d_\perp^2(x^{ij}, l^{ij}) + d_\perp^2(y^{ij}, l^{ij}) ).    (5)
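The definitions above are easy to exercise numerically. The following sketch (Python/NumPy; the paper gives no code, so all function names are ours) builds the Plücker 6-vector of equation (2)'s convention from two homogeneous points, and checks that the (3 × 6) matrix of equation (3) maps it to the image line through the two projected points.

```python
import numpy as np

def plucker_from_points(M, N):
    # M, N: homogeneous 4-vectors. L = (a | b) with a = Mbar x Nbar, b = m*Nbar - n*Mbar.
    a = np.cross(M[:3], N[:3])
    b = M[3] * N[:3] - N[3] * M[:3]
    return np.concatenate([a, b])

def line_projection_matrix(P):
    # P: 3x4 = (Pbar | p). Returns the 3x6 matrix of equation (3):
    # (det(Pbar) Pbar^{-T} | [p]_x Pbar).
    Pbar, p = P[:, :3], P[:, 3]
    skew = np.array([[0.0, -p[2], p[1]],
                     [p[2], 0.0, -p[0]],
                     [-p[1], p[0], 0.0]])
    return np.hstack([np.linalg.det(Pbar) * np.linalg.inv(Pbar).T, skew @ Pbar])
```

The projected line `line_projection_matrix(P) @ L` is proportional to the cross product of the two projected points, `np.cross(P @ M, P @ N)`, which is exactly how equation (3) is derived in the text.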
3. Triangulation
This section discusses the computation of structure given camera motion. We propose direct linear and iterative non-linear methods to recover Plücker line coordinates. These algorithms are general in the sense that they can be used with calibrated, partially calibrated or uncalibrated cameras. First, we describe a somewhat trivial linear algorithm where a biased error function (compared to the reprojection error) is minimized. This algorithm is subject to the same kind of drawback as the 8-point algorithm for computing the fundamental matrix: due to possible noise in the data, the resulting 6-vectors do not in general satisfy the bilinear Plücker constraint (2), similarly to the matrix computed by the 8-point algorithm not being rank deficient [5, §10.2]. We propose what we call a Plücker correction procedure, which computes the closest Plücker coordinates to a given 6-vector. Second, we propose an algorithm where the reprojection error of the line is minimized. The cornerstone of this algorithm is the linearization of the Plücker constraint. Since the reconstruction of each line is independent of the others, we drop the index j in this section.

3.1. Linear Algorithm

We describe a linear algorithm, 'LIN'. In the reprojection error (5), each term is based on the square of the 2D point-to-line orthogonal distance d_\perp defined by equation (1). The denominator of this distance is the cause of the non-linearity. Ignoring this denominator leads to an algebraic distance, denoted d_a, biased compared to the orthogonal distance. d_a is linear in the predicted line l and defined by d_a^2(q, l) = d_\perp^2(q, l) w^2 = (q^T l)^2, where the scalar factor w encapsulates the bias as w^2 = l_1^2 + l_2^2:

    (w^i)^2 = (P^i L)_1^2 + (P^i L)_2^2.    (6)

We define the biased linear least squares error function:

    B(L, M) = \sum_{i=1}^{n} ( (x^{iT} P^i L)^2 + (y^{iT} P^i L)^2 )    (7)
            = || A_{(2n×6)} L ||^2,  with  A = ( ... ; x^{iT} P^i ; y^{iT} P^i ; ... ).    (8)

Since L is a homogeneous vector, we add the constraint ||L||^2 = 1. The L that minimizes B(L, M) is then given by the singular vector of A associated with its smallest singular value. Due to noise, the recovered 6-vector does not in general satisfy the Plücker constraint (2).

3.2. Plücker Correction

Let L^T ~ (a^T | b^T) be a 6-vector that does not necessarily satisfy the Plücker constraint (2), i.e. a^T b might be non-zero. We seek \hat{L}^T ~ (u^T | v^T), defined by min_{\hat{L}, u^T v = 0} ||\hat{L} - L||^2. Although this problem has a clear and concise formulation, it is not trivial. We propose the following solution, summarized in a practical manner in table 1. We transform the original 3D problem to an equivalent 2D problem, solve the 2D transformed problem and bring the solution back to the original 3D coordinate frame.

A geometric interpretation. We interpret the 3-vectors a, b, u and v as coordinates of 3D points. These points are not directly linked to the underlying 3D line; this interpretation is just intended to visualize the problem. The Plücker constraint u^T v = 0 corresponds to the fact that the lines induced by the origin with u and v are perpendicular. The correction criterion is the sum of squared Euclidean distances between a and u and between b and v. Hence, the problem may be formulated as finding two points u and v, as close as possible to a and b respectively, and such that the lines induced by the origin with u and v are perpendicular. We begin by rotating the coordinate frame such that a and b are transferred onto the z = 0 plane. This is the reduction of the problem. We solve the reduced problem by finding two points on the z = 0 plane, minimizing the correction criterion and satisfying the Plücker constraint. Finally, we express the solution back in the original space.

Reducing the problem. Let us define the (3 × 2) matrices \bar{C} ~ (a b) and C ~ (u v). The Plücker constraint is fulfilled if and only if the columns of matrix C are orthogonal. We rewrite the correction criterion as:

    O = ||\hat{L} - L||^2 = ||C - \bar{C}||^2.

Using the singular value decomposition \bar{C}_{(3×2)} = \bar{U}_{(3×2)} \bar{\Sigma}_{(2×2)} \bar{V}^T_{(2×2)}:

    O = ||\bar{U}\bar{\Sigma}\bar{V}^T - C||^2 = ||\bar{\Sigma}\bar{V}^T - \bar{U}^T C||^2,

since \bar{U} has orthonormal columns. We define \bar{Z} = \bar{\Sigma}\bar{V}^T and Z = \bar{U}^T C. Matrix \bar{V} is orthonormal and \bar{\Sigma} is diagonal, hence the rows of \bar{Z} are orthogonal (i.e. \bar{Z}\bar{Z}^T is diagonal, but not \bar{Z}^T\bar{Z}). Note that Z = \bar{U}^T C implies C = \bar{U}Z, even if \bar{U}\bar{U}^T is not the identity³. The problem is reduced to finding a column-orthogonal⁴ matrix Z, as close as possible to the row-orthogonal matrix \bar{Z}.

Solving the reduced problem. We parameterize the column-orthogonal matrix Z as Z = V\Sigma, where V is orthonormal and \Sigma is diagonal. Hence:

    O = ||\bar{\Sigma}\bar{V}^T - V\Sigma||^2 = ||V^T \bar{\Sigma}\bar{V}^T - \Sigma||^2.

The diagonal matrix \Sigma which minimizes this expression is given by the diagonal entries of V^T \bar{\Sigma}\bar{V}^T, and does not depend on the solution for V. The orthonormal matrix V = (v_1 v_2) is given by minimizing the sum of squares of the off-diagonal entries of V^T \bar{Z}, with \bar{Z} = \bar{\Sigma}\bar{V}^T = (z_1 z_2):

    O = (v_1^T z_2)^2 + (v_2^T z_1)^2.

Define the rotation matrix with angle π/2, M = ( 0 -1 ; 1 0 ), and parameterize the orthonormal matrix V by a unit vector v, as v_1 = v, v_2 = Mv. The correction criterion can be rewritten as:

    O = (v^T z_2)^2 + (v^T M^T z_1)^2 = ||T v||^2,  with  T = ( z_2^T ; z_1^T M ).

The unit vector v minimizing this expression is given by the singular vector associated with the smallest singular value of matrix T.

Expressing the solution. From the vector v = (v_1, v_2)^T which solves the reduced problem, we form the orthonormal matrix V = ( v_1 -v_2 ; v_2 v_1 ). The diagonal matrix \Sigma is given by \Sigma = diag(V^T \bar{\Sigma}\bar{V}^T). Table 1 summarizes this algorithm.

³ Indeed, denote u_i the columns of matrix \bar{U} and form U = (u_1 u_2 u_1 × u_2). We have U^T \bar{U} = ( I_{(2×2)} 0_{(2×1)} )^T. Let us multiply the correction criterion by U^T: O = ||( \bar{V}\bar{\Sigma} | 0_{(2×1)} )^T - U^T C||^2. Denote Y_{(3×2)} = U^T C. The optimal solution has the form Y^T = ( Z^T | 0_{(2×1)} ), since, according to the geometric interpretation, the corrected points u and v must lie on the plane defined by the points a, b and the origin, i.e. the plane z = 0. Therefore, we obtain C = UY = \bar{U}Z.

⁴ The fact that matrix Z = \bar{U}^T C is column-orthogonal is induced by the Plücker constraint. Indeed, this constraint implies that C is column-orthogonal, hence C^T C is diagonal. Matrix U^T C, where U = (u_1 u_2 u_1 × u_2) = ( \bar{U} | \tilde{u} ) ∈ SO(3), is also column-orthogonal. Observe that C^T U U^T C = C^T \bar{U}\bar{U}^T C + C^T \tilde{u}\tilde{u}^T C = C^T \bar{U}\bar{U}^T C since \tilde{u}^T C = 0^T. Hence, matrix \bar{U}^T C is column-orthogonal.

• Compute the singular value decomposition (a b) = \bar{U}\bar{\Sigma}\bar{V}^T.
• Let \bar{Z} = \bar{\Sigma}\bar{V}^T = (z_1 z_2); form the matrix T = ( z_2^T ; z_1^T M ), where M is the rotation of angle π/2.
• Compute the singular vector v = (v_1, v_2)^T associated with the smallest singular value of matrix T.
• Let V = ( v_1 -v_2 ; v_2 v_1 ); we obtain: (u v) ~ \bar{U} V diag( V^T \bar{\Sigma}\bar{V}^T ).

Table 1. The Plücker correction algorithm. Given a 6-vector L^T ~ (a^T | b^T), this algorithm computes the closest Plücker coordinates \hat{L}^T ~ (u^T | v^T), i.e. u^T v = 0, in the sense of the L2 norm, i.e. ||\hat{L} - L||^2 is minimized.

3.3. Quasi-Linear Algorithms

We describe algorithms 'QLIN 1' and 'QLIN 2', which consider the reprojection error (5). They are based on an iterative bias-correction, through reweighting of the biased error function (7). Such algorithms are coined quasi-linear. We showed previously that the orthogonal and algebraic distances are related by a scalar factor, given by equation (6), depending on the 3D line. The reprojection error and the biased error function are therefore related by a set of such factors, one for each image of the line. The fact that these factors depend on the unknown 3D line suggests an iterative reweighting scheme.

The first approach that comes to mind is 'QLIN 1'. The linear system considered for method LIN is formed and solved. The resulting 6-vector L_0 is corrected to be valid Plücker coordinates. This yields a biased estimate of the 3D line. Using this estimate, weight factors that contain the bias of the linear least squares error function are computed, and used to reweight the equations. The process is iterated to compute successive refined estimates L_k until convergence, where k is the iteration counter. Convergence is determined by thresholding the difference between two consecutive errors. It is typically reached in 3 or 4 iterations. Experimental results show that this naive approach performs very badly, see §5. This is due to the fact that the Plücker constraint is enforced afterwards and is not taken into account while solving the linear least squares system.

To remedy this problem, we propose 'QLIN 2', which linearizes and enforces the Plücker constraint (2), as follows. The algorithm is summarized in table 2. Rewrite the constraint as C(L) = L^T G L, where G_{(6×6)} = ( 0 I ; I 0 ). By expanding this expression to first order around the estimate L_k, and after some minor algebraic manipulations, we obtain the following linear constraint on L_{k+1}:

    C_k(L_{k+1}) = L_k^T G L_{k+1} = 0.

We follow the constrained linear least squares optimization method summarized in [5, §A3.4.3] to enforce this linearized constraint, as well as ||L_{k+1}|| = 1. The idea is to find an orthonormal basis of all possible vectors satisfying the constraint and to solve for a 5-vector γ expressed in this basis. Such an orthonormal basis is provided by computing the nullspace of L_k^T G using SVD. Let \bar{V} be a (6 × 5) orthonormal matrix whose columns span the basis (i.e. L_k^T G \bar{V} = 0); we define L_{k+1} = \bar{V}γ, hence C_k(L_{k+1}) = L_k^T G \bar{V} γ = 0 and ||L_{k+1}|| = ||γ||. We solve for γ by substituting into equation (8) (||A L_{k+1}||^2 = ||A \bar{V} γ||^2). The singular vector associated with the smallest singular value of matrix A\bar{V} provides the solution vector with unit two-norm such that B(L_{k+1}, M) is minimized.

1. Initialization: form the linear least squares system A from equation (8), compute L_0 by minimizing ||A L_0||^2, see §3.1, and apply the Plücker correction procedure described in §3.2. Set k = 0.
2. Constraint linearization: compute the SVD L_k^T G ~ u^T diag(1, 0, 0, 0, 0, 0) ( v_{(6×1)} | \bar{V}_{(6×5)} )^T.
3. Estimation: compute min_{γ, ||γ||^2 = 1} ||A \bar{V} γ||^2 and set L_{k+1} = \bar{V} γ.
4. Bias-correction: reweight the linear system A by computing the weights according to equation (6).
5. Iteration: iterate steps 2, 3 and 4 until convergence.

Table 2. The quasi-linear algorithm 'QLIN 2' for optimal triangulation.
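As a sanity check of table 1, here is a minimal NumPy sketch of the Plücker correction (our own illustrative code, not the authors' implementation; all names are ours):

```python
import numpy as np

def plucker_correction(L):
    """Closest 6-vector to L satisfying the Plücker constraint u.v = 0 (table 1)."""
    Cbar = np.column_stack([L[:3], L[3:]])                       # 3x2 matrix (a | b)
    Ubar, s, Vbar_t = np.linalg.svd(Cbar, full_matrices=False)   # Cbar = Ubar @ diag(s) @ Vbar_t
    Zbar = np.diag(s) @ Vbar_t                                   # 2x2, columns z1, z2
    z1, z2 = Zbar[:, 0], Zbar[:, 1]
    M = np.array([[0.0, -1.0], [1.0, 0.0]])                      # rotation of angle pi/2
    T = np.vstack([z2, M.T @ z1])                                # rows: z2^T and z1^T M
    v = np.linalg.svd(T)[2][-1]                                  # smallest singular vector of T
    V = np.column_stack([v, M @ v])                              # orthonormal V = (v | Mv)
    Sigma = np.diag(np.diag(V.T @ Zbar))                         # keep diagonal entries only
    C = Ubar @ V @ Sigma                                         # corrected (u | v)
    return np.concatenate([C[:, 0], C[:, 1]])
```

On an already-valid Plücker 6-vector the correction is (up to rounding) the identity, and on a noisy 6-vector it returns a constraint-satisfying vector no farther from the input than the naive fix of projecting b onto the orthogonal complement of a.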
4. Bundle Adjustment

Bundle adjustment is the non-linear minimization of the reprojection error (4) over camera and line parameters. We focus on the parameterization of the 3D lines. Parameterizing the camera motion has been addressed in e.g. [1, 5, §A4.6].

4.1. Problem Statement

As said in the introduction, there are various possibilities for over-parameterizing the 4-dimensional set of 3D lines. In the context of non-linear optimization, choosing an over-parameterized representation may induce the following problems. First, the computational cost of each iteration is increased by the superfluous parameters. Second, artificial freedoms in the parameter set (internal gauge freedoms) are induced and may give rise to numerical instabilities. Third, some internal consistency constraints, such as the Plücker constraint, may have to be enforced. These reasons motivate the need for a representation of 3D lines allowing non-linear optimization with the minimum 4 parameters. In that case, there is no free scale induced by homogeneity or internal consistency constraints, and any unconstrained non-linear optimizer can be used.

4.2. The Orthonormal Representation

The orthonormal representation was introduced in [1] for the non-linear optimization of the fundamental matrix with the minimum 7 parameters. It consists in finding a representation involving elements of SO(n) and scalars (hence the term 'orthonormal representation'). In particular, no other algebraic constraints should be necessary, such as the rank-two constraint of fundamental matrices or the bilinear Plücker constraint. Using orthonormal matrices implies that the representation is well-conditioned. Based on such a representation, local update using the minimum number of parameters is possible. Commonly used non-linear optimizers, e.g. Newton-type optimizers such as Levenberg-Marquardt, often require the Jacobian matrix of the error function with respect to the update parameters. In the orthonormal representation framework, we split it into the product of the Jacobian matrix of the error function with respect to the 'standard' entity representation, e.g. the fundamental matrix or Plücker coordinates, and the orthonormal Jacobian matrix, i.e. that of the 'standard' representation with respect to the update parameters.

Example: representing P^1. We derive the orthonormal representation of the 1-dimensional projective space P^1. This is used in §4.3 to derive the orthonormal representation of 3D lines. Let \sigma ∈ P^1. Such a 2-vector is defined up to scale and therefore has only 1 degree of freedom. We represent it by an SO(2) matrix W defined by:

    W = (1 / ||\sigma||) ( \sigma_1  -\sigma_2 ; \sigma_2  \sigma_1 ).    (9)

The first column of this matrix is \sigma itself, normalized to unit norm. Let \theta be the update parameter. A local update step is W ← W R(\theta), where R(\theta) is the 2D rotation matrix of angle \theta. The Jacobian matrix \partial\sigma/\partial\theta evaluated at \theta_0 = 0 (the update is with respect to a base rotation) is given by:

    \partial\sigma/\partial\theta |_{\theta_0} = \partial w_1/\partial\theta |_{\theta_0} = ( -\sigma_2 ; \sigma_1 ) = w_2,    (10)

where w_i is the i-th column of W.

Updating SO(3). A matrix U ∈ SO(3) can be locally updated using 3 parameters by any well-behaved (locally non-singular) representation, such as the 3 Euler angles \theta^T = (\theta_1 \theta_2 \theta_3), as:

    U ← U R(\theta)  with  R(\theta) = R_x(\theta_1) R_y(\theta_2) R_z(\theta_3),    (11)

where R_x(\theta_1), R_y(\theta_2) and R_z(\theta_3) are SO(3) matrices representing 3D rotations around the x-, y- and z-axes with angles \theta_1, \theta_2 and \theta_3 respectively. The Jacobian matrix is derived as follows. As in the SO(2) case, the update is with respect to a base rotation. The orthonormal Jacobian matrix is therefore evaluated at \theta_0 = 0_{(3×1)}:

    \partial U/\partial\theta |_{\theta_0} = ( \partial U/\partial\theta_1 |_{\theta_0}  \partial U/\partial\theta_2 |_{\theta_0}  \partial U/\partial\theta_3 |_{\theta_0} ).

After minor algebraic manipulations, we obtain:

    \partial U/\partial\theta_1 |_{\theta_0} = \partial ( U R_x(\theta_1) R_y(\theta_2) R_z(\theta_3) ) / \partial\theta_1 |_{\theta_0} = ( 0_3 | u_3 | -u_2 ),    (12)

where u_i is the i-th column of U. Similarly:

    \partial U/\partial\theta_2 |_{\theta_0} = ( -u_3 | 0_3 | u_1 )    (13)
    \partial U/\partial\theta_3 |_{\theta_0} = ( u_2 | -u_1 | 0_3 ).    (14)

These expressions are vectorized to form the orthonormal Jacobian matrix.

4.3. The Case of 3D Lines

The case of 3D lines is strongly linked with the cases of SO(2) and SO(3), as shown by the following result: any (projective) 3D line L can be represented by (U, W) ∈ SO(3) × SO(2), where SO(2) and SO(3) are the Lie groups of, respectively, (2 × 2) and (3 × 3) rotation matrices. (U, W) is the orthonormal representation of the 3D line L. The proof of this result is obtained by showing that any 3D line has an orthonormal representation (U, W) ∈ SO(3) × SO(2), while any (U, W) ∈ SO(3) × SO(2) corresponds to a unique 3D line. The next paragraph illustrates this by means of Plücker coordinates. Note that this result is consistent with the fact that a 3D line has 4 degrees of freedom, since an element of SO(2) has 1 degree of freedom and an element of SO(3) has 3 degrees of freedom. Using this representation of 3D lines, we show that there exists a locally non-singular minimal parameterization. Therefore, 3D lines can be locally updated with the minimum 4 parameters. The update scheme is inspired by those given above for 2D and 3D rotation matrices, and can be plugged into most existing non-linear optimization techniques. These results are summarized in table 3 in a practical manner.

Relating Plücker coordinates and the orthonormal representation. The orthonormal representation of a 3D line can be computed from its Plücker coordinates L^T ~ (a^T | b^T) as follows. Let C_{(3×2)} ~ (a | b) be factored as:

    C ~ ( a/||a||  b/||b||  (a×b)/||a×b|| ) ( ||a||  0 ; 0  ||b|| ; 0  0 ).

In practice, we use the QR decomposition C_{(3×2)} = U_{(3×3)} \Sigma_{(3×2)}. As already mentioned, the special form of matrix \Sigma is due to the Plücker constraint. While U ∈ SO(3), the two non-zero entries of \Sigma, defined up to scale, can be represented by an SO(2) matrix W, as shown in §4.2. Going back from the orthonormal representation to Plücker coordinates is trivial. The Plücker coordinates of the line are obtained from its orthonormal representation (U, W) as:

    L^T ~ ( w_{11} u_1^T | w_{21} u_2^T ),    (15)

where u_i is the i-th column of U.

A 4-parameter update. Consider the orthonormal representation (U, W) ∈ SO(3) × SO(2) of a given 3D line. Since U ∈ SO(3), as reviewed in §4.2, it cannot be minimally parameterized but can be locally updated using equation (11), as U ← U R(\theta), where \theta ∈ R^3. Matrix W ∈ SO(2) can be updated as W ← W R(\theta), where \theta ∈ R. We define the update parameters by the 4-vector p^T ~ (\theta^T | \theta).

Initialization. The initial guess is given by the Plücker coordinates L_0^T ~ (a_0^T | b_0^T).
• Compute the orthonormal representation (U, W) ∈ SO(3) × SO(2) of L_0 by QR decomposition: (a_0 | b_0) = U ( \sigma_1  0 ; 0  \sigma_2 ; 0  0 ), with U ∈ SO(3) and (\sigma_1 \sigma_2)^T ∈ P^1, and set W = (1/||\sigma||) ( \sigma_1  -\sigma_2 ; \sigma_2  \sigma_1 ).
• The 4 optimization parameters are p^T = (\theta^T | \theta), where the 3-vector \theta and the scalar \theta are used to update U and W respectively.
Update (i.e. one optimization step).
• The current line is L^T ~ ( w_{11} u_1^T | w_{21} u_2^T ) and \partial L/\partial p is given by equation (16).
• Compute p by minimizing some criterion.
• Update U and W: U ← U R(\theta) and W ← W R(\theta).

Table 3. Elements for 3D line optimization using the minimal 4 parameters through the orthonormal representation.

We denote by J the (6 × 4) Jacobian matrix of the Plücker coordinates with respect to the orthonormal representation. Matrix J must be evaluated at p_0 = 0_{(4×1)}:

    J |_{p_0} = ( \partial L/\partial\theta_1 |_{p_0}  \partial L/\partial\theta_2 |_{p_0}  \partial L/\partial\theta_3 |_{p_0}  \partial L/\partial\theta |_{p_0} ).

By using the orthonormal-representation-to-Plücker-coordinates equation (15) and the Jacobian matrices for SO(2) and SO(3), as defined by equations (10), (12), (13) and (14), we obtain, after minor algebraic manipulations:

    J_{(6×4)} = ( 0_{(3×1)}    -\sigma_1 u_3   \sigma_1 u_2   -\sigma_2 u_1 ;
                  \sigma_2 u_3   0_{(3×1)}    -\sigma_2 u_1    \sigma_1 u_2 ),    (16)

where (\sigma_1 \sigma_2)^T is the first column of W.

Geometric interpretation. Each of the 4 above-defined update parameters p has a geometric interpretation. Since matrix W encapsulates the distance d from the origin O to L, parameter \theta acts on d. Matrix U is related to a 3D coordinate frame attached to L. Parameter \theta_1 rotates L around a circle with radius d, centered at O, and lying on the plane defined by O and L. Parameter \theta_2 rotates L around a circle with radius d, centered at O, and lying in the plane containing O and the closest point Q of L to O, and perpendicular to L. Parameter \theta_3 rotates L around the axis defined by O and Q. For the last three cases, the angles of rotation are the parameters themselves. This interpretation makes it easy to incorporate a priori knowledge while estimating a line. For example, to leave the direction of the line invariant, one may use only the 2 update parameters \theta_2 and \theta, while to leave the distance of the line to the origin invariant, one may use only the 3 update parameters \theta.
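The conversions of equation (15) and the update step of table 3 can be sketched as follows (again Python/NumPy, our own illustrative code; all names are ours). The point of the exercise: after any 4-parameter update, the recovered 6-vector automatically satisfies the Plücker constraint, with no explicit enforcement needed.

```python
import numpy as np

def rot2(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s], [s, c]])

def plucker_to_orthonormal(L):
    # QR decomposition of the 3x2 matrix (a | b); the Plücker constraint a.b = 0
    # makes the triangular factor diagonal, giving U in SO(3) and W in SO(2).
    C = np.column_stack([L[:3], L[3:]])
    U, R = np.linalg.qr(C, mode='complete')   # U: 3x3, R: 3x2
    if np.linalg.det(U) < 0:
        U[:, 2] *= -1.0                       # force det(U) = +1; u3 is unused in (15)
    sigma = np.array([R[0, 0], R[1, 1]])
    sigma /= np.linalg.norm(sigma)            # (sigma1, sigma2) defined up to scale
    W = np.array([[sigma[0], -sigma[1]], [sigma[1], sigma[0]]])
    return U, W

def orthonormal_to_plucker(U, W):
    # Equation (15): L ~ (w11 u1 | w21 u2).
    return np.concatenate([W[0, 0] * U[:, 0], W[1, 0] * U[:, 1]])

def update(U, W, theta, t):
    # Table 3 update step: U <- U Rx(th1) Ry(th2) Rz(th3), W <- W R(t).
    c, s = np.cos(theta), np.sin(theta)
    Rx = np.array([[1, 0, 0], [0, c[0], -s[0]], [0, s[0], c[0]]])
    Ry = np.array([[c[1], 0, s[1]], [0, 1, 0], [-s[1], 0, c[1]]])
    Rz = np.array([[c[2], -s[2], 0], [s[2], c[2], 0], [0, 0, 1]])
    return U @ Rx @ Ry @ Rz, W @ rot2(t)
```

Round-tripping a valid Plücker 6-vector through `plucker_to_orthonormal` and `orthonormal_to_plucker` recovers it up to the free homogeneous scale, consistent with the "~" in equation (15).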
5. Experimental Results
5.1. Simulated Data
Our simulated experimental setup consists of a set of cameras looking inwards at 3D lines randomly chosen in a sphere. Cameras are spread widely around the sphere. We fix the focal length of the cameras to 1000 (in pixels). Note that this information is not used in the rest of the experiments. The endpoints of all lines are projected into all views, where their positions are corrupted by additive Gaussian noise. We vary the parameters of this setup to assess and compare the quality of the different estimators on various scene configurations. We compare the 4 methods given in this paper, LIN, QLIN 1, QLIN 2 and MLE (bundle adjustment based on our orthonormal representation of 3D lines), as well as the method given in [5, §15.4.1], denoted 'MLE HARTLEY'. This method consists in non-linearly computing the trifocal tensor as well as the reconstructed lines by minimizing the reprojection error (4) and parameterizing the 3D lines by two of their three images. We also compare QLIN 2 to a direct Levenberg-Marquardt-based minimization of the reprojection error: these two methods give indistinguishable results in all our experiments. Note that most existing methods, e.g. [6, 11, 12, 15], can be applied only when camera calibration is available. We measure the quality of an estimate using the estimation error, as described in [5, §4], which also provides the theoretical lower bound. The estimation error is equivalent to the value of the negative log likelihood (4) (i.e. the reprojection error).

We vary the added noise level from 0 to 2 pixels, while considering 20 lines and 3 views. The result is shown in figure 1 (a). One observes that, beyond 1 pixel of noise, methods LIN and QLIN 1 behave very badly. This is mainly due to the bias introduced by the Plücker correction procedure. Methods QLIN 2, MLE and MLE HARTLEY degrade gracefully as the noise level increases. Method QLIN 2 gives reasonable results. Methods MLE and MLE HARTLEY give indistinguishable results, very close to the theoretical lower bound.

We vary the number of lines from 15 to 60, while considering 1 pixel of noise and 3 views. The result is shown in figure 1 (b). Similar conclusions to those of the previous experiment can be drawn, except that when more than 30 lines are considered, methods LIN and QLIN 1 give reasonable results. Also, methods MLE and MLE HARTLEY give results indistinguishable from the theoretical lower bound when more than 45 lines are considered.

Figure 1. Estimation error for different methods when varying the standard deviation of the noise added to the image endpoints (a) and the number of lines considered (b).

Another experiment, not shown here due to lack of space, shows that when the number of views increases, the estimation error decreases for all compared methods. Note that method MLE HARTLEY cannot be used with more than three views and is therefore not concerned by these conclusions. We observe that beyond 10 views, the result of method MLE is indistinguishable from the theoretical lower bound. The results given by methods LIN and QLIN 1 are reasonable when more than 15 views are considered. We observe that the quasi-linear methods always converge within 5 iterations.
5.2. Real Data
We tested our algorithms on several image sequences. For one of them, see figure 2, we show results.

Figure 2. Sample images out of a 5-frame indoor sequence, overlaid with manually provided lines. Note that the optical distortion is not corrected.

We provide 45 line correspondences by hand. Note that some of them are visible in two views only. We use these line correspondences to compute the trifocal tensor corresponding to each subsequence formed by a triplet of consecutive images, using the linear method described in e.g. [5, §15.2]. We use method QLIN 2 to reconstruct the lines associated with each triplet. We registered these subsequences using the method given in [2]. At this point, we have a suboptimal guess of metric structure and motion. We further refine it using our structure-from-motion algorithms, reconstructing each line by taking into account all of its images. The corresponding estimation errors are, respectively for LIN, QLIN 1 and QLIN 2, 2.3, 1.9 and 1.4 pixels. We used the result of QLIN 2 to initialize our maximum likelihood estimator for structure and motion, based on the proposed orthonormal representation together with a metric parameterization of the camera motion, which ends up with a 0.9 pixel estimation error. For each estimation, we reconstruct the endpoints corresponding to the first view (shown on the left of figure 2). The maximum likelihood endpoints are given by orthogonally projecting their images onto the image of the corresponding line. These results are visible in figure 3. Note the significant improvement of methods QLIN 2 and MLE over methods LIN and QLIN 1. The lines predicted by MLE and the original lines are indistinguishable. Figure 4 shows the cameras and lines reconstructed by MLE. There is visually no difference with the reconstruction provided by algorithm QLIN 2, while the reconstructions provided by LIN and QLIN 1 appear distorted.

Figure 3. Zoom on some original (white) and reprojected (black) lines, for methods LIN & QLIN 1, QLIN 2 and MLE.

Figure 4. Snapshots of the cameras and lines reconstructed by method MLE. The images shown in figure 2 correspond to the topmost and bottommost cameras.
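The maximum likelihood endpoints above are obtained by orthogonally projecting each observed image endpoint onto the reprojected line, and the pixel errors quoted are orthogonal point-to-line distances. A minimal NumPy sketch of these two operations (function names are ours, not the paper's):

```python
import numpy as np

def project_point_onto_line(x, l):
    """Orthogonally project a 2D point onto a 2D line.

    x : (2,) inhomogeneous image point
    l : (3,) homogeneous line (a, b, c), representing ax + by + c = 0
    Returns the closest point on the line to x.
    """
    a, b, c = l
    n = np.array([a, b])               # line normal
    d = (n @ x + c) / (n @ n)          # signed offset along the normal
    return x - d * n

def point_line_error(x, l):
    """Orthogonal point-to-line distance, in pixels for image lines."""
    a, b, c = l
    return abs(a * x[0] + b * x[1] + c) / np.hypot(a, b)
```

For instance, projecting (3, 4) onto the line x = y (coefficients (1, -1, 0)) gives (3.5, 3.5), and the projected point always satisfies the line equation exactly up to floating-point error.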
References
[1] A. Bartoli. On the non-linear optimization of projective motion using minimal parameters. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, volume 2, pages 340–354, May 2002.
[2] A. Bartoli and P. Sturm. The 3D line motion matrix and alignment of line reconstructions. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Kauai, Hawaii, USA, volume I, pages 287–292. IEEE Computer Society Press, December 2001.
[3] O. Faugeras and B. Mourrain. On the geometry and algebra of the point and line correspondences between n images. In Proceedings of the 5th International Conference on Computer Vision, Cambridge, Massachusetts, USA, pages 951–956, June 1995.
[4] A. Fitzgibbon and A. Zisserman. Automatic camera recovery for closed or open image sequences. In European Conference on Computer Vision, pages 311–326, June 1998.
[5] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, June 2000.
[6] Y. Liu and T. Huang. A linear algorithm for motion estimation using straight line correspondences. Computer Vision, Graphics and Image Processing, 44(1):35–57, October 1988.
[7] D. Martinec and T. Pajdla. Line reconstruction from many perspective images by factorization. In Proceedings of the Conference on Computer Vision and Pattern Recognition, Madison, Wisconsin, USA, volume I, pages 497–502. IEEE Computer Society Press, June 2003.
6. Conclusion
We addressed the problem of structure and motion recovery from line correspondences across multiple views. First, we proposed an optimal triangulation algorithm. Given camera motion, the Plücker coordinates of the lines are estimated by minimizing the reprojection error. The algorithm relies on an iteratively reweighted least squares scheme. We linearized the bilinear Plücker constraint to incorporate it up to first order in the estimation process. A Plücker correction procedure is proposed to find the nearest Plücker coordinates to a given 6-vector. Second, we proposed the orthonormal representation of 3D lines, which allows non-linear optimization with the minimal 4 parameters within any unconstrained non-linear optimizer, contrary to previously proposed over-parameterizations. This representation is well-conditioned and allows analytic differentiation. Experimental results on simulated and real data show that the standard linear method and its naive bias-corrected extension perform very badly in many cases and should only be used to initialize a non-linear estimator. Our bias-corrected algorithm including the Plücker constraint performs as well as direct Levenberg-Marquardt-based triangulation. It is therefore a good solution to initialize subsequent bundle adjustment. Based on our orthonormal representation, bundle adjustment gives results close to the theoretical lower bound and indistinguishable from the three-view maximum likelihood estimator of [5, §15.4.1], while being usable with any number of views. Most of the algorithms proposed in this paper are summarized in a practical manner.
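Two of the building blocks above can be made concrete in a few lines. The sketch below gives a closed-form Plücker correction of our own (not necessarily the paper's exact procedure: writing a 6-vector as (a, b), the constraint a·b = 0 becomes ||u|| = ||v|| in the rotated coordinates u = (a+b)/√2, v = (a−b)/√2, and the nearest constrained point rescales u and v to their average norm), together with an orthonormal factorization of a valid Plücker vector into U ∈ SO(3) and W ∈ SO(2), i.e. 3 + 1 = 4 underlying parameters. Both functions assume the generic case where neither 3-vector half is zero:

```python
import numpy as np

def plucker_correction(l):
    """Closest 6-vector to l = (a, b) satisfying the Plucker constraint
    a . b = 0. In the rotated frame u = (a+b)/sqrt(2), v = (a-b)/sqrt(2)
    the constraint reads ||u|| = ||v||; the nearest point on that set
    keeps both directions and rescales u and v to their average norm.
    """
    a, b = l[:3], l[3:]
    u, v = (a + b) / np.sqrt(2), (a - b) / np.sqrt(2)
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    r = 0.5 * (nu + nv)                 # common norm minimizing the distance
    u, v = u * (r / nu), v * (r / nv)
    return np.concatenate([(u + v) / np.sqrt(2), (u - v) / np.sqrt(2)])

def plucker_to_orthonormal(l):
    """Factor a valid Plucker vector l = (a, b), with a . b = 0 and
    a, b nonzero, into U in SO(3) and W in SO(2)."""
    a, b = l[:3], l[3:]
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    c = np.cross(a, b)                  # nonzero: a and b are orthogonal
    U = np.column_stack([a / na, b / nb, c / np.linalg.norm(c)])
    s0, s1 = np.array([na, nb]) / np.hypot(na, nb)
    W = np.array([[s0, -s1], [s1, s0]])
    return U, W

def orthonormal_to_plucker(U, W):
    """Recover the Plucker coordinates (up to scale) from (U, W)."""
    return np.concatenate([W[0, 0] * U[:, 0], W[1, 0] * U[:, 1]])
```

Round-tripping `orthonormal_to_plucker(*plucker_to_orthonormal(l))` recovers l up to scale, which is the sense in which 4 parameters are minimal for a projective 3D line.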
[8] M. Pollefeys, R. Koch, and L. Van Gool. Self-calibration and metric reconstruction in spite of varying and unknown internal camera parameters. International Journal of Computer Vision, 32(1):7–25, 1999.
[9] L. Quan and T. Kanade. Affine structure from line correspondences with uncalibrated affine cameras. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(8):834–845, August 1997.
[10] Y. Seo and K. S. Hong. Sequential reconstruction of lines in projective space. In Proceedings of the 13th International Conference on Pattern Recognition, Vienna, Austria, pages 503–507, August 1996.
[11] M. Spetsakis and J. Aloimonos. Structure from motion using line correspondences. International Journal of Computer Vision, 4:171–183, 1990.
[12] C. Taylor and D. Kriegman. Structure and motion from line segments in multiple images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(11):1021–1032, November 1995.
[13] B. Triggs. Factorization methods for projective structure and motion. In Proceedings of the Conference on Computer Vision and Pattern Recognition, San Francisco, California, USA, pages 845–851, 1996.
[14] T. Viéville, Q. Luong, and O. Faugeras. Motion of points and lines in the uncalibrated case. International Journal of Computer Vision, 17(1), 1995.
[15] J. Weng, T. Huang, and N. Ahuja. Motion and structure from line correspondences: Closed-form solution, uniqueness, and optimization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(3):318–336, 1992.