Commit 052b398a authored by Linus Torvalds

Merge branch 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs updates from Al Viro:
 "In this pile: pathname resolution rewrite.

   - recursion in link_path_walk() is gone.

   - nesting limits on symlinks are gone (the only limit remaining is
     that the total amount of symlinks is no more than 40, no matter how
     nested).

   - "fast" (inline) symlinks are handled without leaving rcuwalk mode.

   - stack footprint (independent of the nesting) is below kilobyte now,
     about on par with what it used to be with one level of nested
     symlinks and ~2.8 times lower than it used to be in the worst case.

   - struct nameidata is entirely private to fs/namei.c now (not even
     opaque pointers are being passed around).

   - ->follow_link() and ->put_link() calling conventions had been
     changed; all in-tree filesystems converted, out-of-tree should be
     able to follow reasonably easily.

     For out-of-tree conversions, see Documentation/filesystems/porting
     for details (and in-tree filesystems for examples of conversion).

  That has sat in -next since mid-May, seems to survive all testing
  without regressions and merges clean with v4.1"
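
The "total amount of symlinks" rule from the message above can be illustrated with a toy resolver. This is a minimal userspace sketch, not the code in fs/namei.c: `struct toy_entry`, `toy_resolve()` and `MAX_TOTAL_LINKS` are hypothetical names, and the toy namespace only models chains of links that point at other names (it does not model multi-component paths). It shows a loop with a single running counter instead of recursion with a depth limit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_TOTAL_LINKS 40	/* the limit quoted in the pull message */

/* Hypothetical toy namespace: target == NULL means "not a symlink". */
struct toy_entry {
	const char *name;
	const char *target;
};

/*
 * Follow links iteratively, counting every link traversed.
 * Returns 0 and sets *result to the final name, or -ENOENT / -ELOOP.
 */
int toy_resolve(const struct toy_entry *tab, size_t n,
		const char *name, const char **result)
{
	int total = 0;

	assert(tab && name && result);
	for (;;) {
		const struct toy_entry *e = NULL;
		size_t i;

		for (i = 0; i < n; i++) {
			if (strcmp(tab[i].name, name) == 0) {
				e = &tab[i];
				break;
			}
		}
		if (!e)
			return -ENOENT;
		if (!e->target) {		/* reached a non-link: done */
			*result = e->name;
			return 0;
		}
		if (++total > MAX_TOTAL_LINKS)	/* one counter, no depth limit */
			return -ELOOP;
		name = e->target;		/* keep walking, no recursion */
	}
}
```

With this shape the stack use does not grow with the length of the chain; only the counter does.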

* 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (131 commits)
  turn user_{path_at,path,lpath,path_dir}() into static inlines
  namei: move saved_nd pointer into struct nameidata
  inline user_path_create()
  inline user_path_parent()
  namei: trim do_last() arguments
  namei: stash dfd and name into nameidata
  namei: fold path_cleanup() into terminate_walk()
  namei: saner calling conventions for filename_parentat()
  namei: saner calling conventions for filename_create()
  namei: shift nameidata down into filename_parentat()
  namei: make filename_lookup() reject ERR_PTR() passed as name
  namei: shift nameidata inside filename_lookup()
  namei: move putname() call into filename_lookup()
  namei: pass the struct path to store the result down into path_lookupat()
  namei: uninline set_root{,_rcu}()
  namei: be careful with mountpoint crossings in follow_dotdot_rcu()
  Documentation: remove outdated information from automount-support.txt
  get rid of assorted nameidata-related debris
  lustre: kill unused helper
  lustre: kill unused macro (LOOKUP_CONTINUE)
  ...
parents b953c0d2 b853a161
......@@ -50,8 +50,8 @@ prototypes:
int (*rename2) (struct inode *, struct dentry *,
struct inode *, struct dentry *, unsigned int);
int (*readlink) (struct dentry *, char __user *,int);
-void * (*follow_link) (struct dentry *, struct nameidata *);
-void (*put_link) (struct dentry *, struct nameidata *, void *);
+const char *(*follow_link) (struct dentry *, void **);
+void (*put_link) (struct inode *, void *);
void (*truncate) (struct inode *);
int (*permission) (struct inode *, int, unsigned int);
int (*get_acl)(struct inode *, int);
......
-Support is available for filesystems that wish to do automounting support (such
-as kAFS which can be found in fs/afs/). This facility includes allowing
-in-kernel mounts to be performed and mountpoint degradation to be
-requested. The latter can also be requested by userspace.
+Support is available for filesystems that wish to do automounting
+support (such as kAFS which can be found in fs/afs/ and NFS in
+fs/nfs/). This facility includes allowing in-kernel mounts to be
+performed and mountpoint degradation to be requested. The latter can
+also be requested by userspace.
======================
IN-KERNEL AUTOMOUNTING
======================
-A filesystem can now mount another filesystem on one of its directories by the
-following procedure:
-(1) Give the directory a follow_link() operation.
-When the directory is accessed, the follow_link op will be called, and
-it will be provided with the location of the mountpoint in the nameidata
-structure (vfsmount and dentry).
-(2) Have the follow_link() op do the following steps:
-(a) Call vfs_kern_mount() to call the appropriate filesystem to set up a
-superblock and gain a vfsmount structure representing it.
-(b) Copy the nameidata provided as an argument and substitute the dentry
-argument into it the copy.
-(c) Call do_add_mount() to install the new vfsmount into the namespace's
-mountpoint tree, thus making it accessible to userspace. Use the
-nameidata set up in (b) as the destination.
-If the mountpoint will be automatically expired, then do_add_mount()
-should also be given the location of an expiration list (see further
-down).
-(d) Release the path in the nameidata argument and substitute in the new
-vfsmount and its root dentry. The ref counts on these will need
-incrementing.
+See section "Mount Traps" of Documentation/filesystems/autofs4.txt
Then from userspace, you can just do something like:
......@@ -61,17 +35,18 @@ AUTOMATIC MOUNTPOINT EXPIRY
===========================
Automatic expiration of mountpoints is easy, provided you've mounted the
-mountpoint to be expired in the automounting procedure outlined above.
+mountpoint to be expired in the automounting procedure outlined separately.
To do expiration, you need to follow these steps:
-(3) Create at least one list off which the vfsmounts to be expired can be
-hung. Access to this list will be governed by the vfsmount_lock.
+(1) Create at least one list off which the vfsmounts to be expired can be
+hung.
-(4) In step (2c) above, the call to do_add_mount() should be provided with a
-pointer to this list. It will hang the vfsmount off of it if it succeeds.
+(2) When a new mountpoint is created in the ->d_automount method, add
+the mnt to the list using mnt_set_expiry()
+mnt_set_expiry(newmnt, &afs_vfsmounts);
-(5) When you want mountpoints to be expired, call mark_mounts_for_expiry()
+(3) When you want mountpoints to be expired, call mark_mounts_for_expiry()
with a pointer to this list. This will process the list, marking every
vfsmount thereon for potential expiry on the next call.
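
The "mark now, expire on the next call" behaviour described in step (3) can be sketched in userspace. This is a toy model under stated assumptions, not the kernel implementation: `struct toy_mount`, `toy_use()` and `toy_mark_for_expiry()` are hypothetical names, the list is a plain array, and it is assumed that using a mount between two passes clears its mark and that a busy mount is never expired.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vfsmount hanging off an expiry list. */
struct toy_mount {
	int busy;		/* nonzero while something still uses the mount */
	int expiry_mark;	/* set by one pass, checked by the next */
	int expired;		/* set once the mount has been expired */
};

/* Assumed behaviour: any use of the mount between passes clears the mark. */
void toy_use(struct toy_mount *m)
{
	m->expiry_mark = 0;
}

/* One pass over the list; returns how many mounts were expired. */
int toy_mark_for_expiry(struct toy_mount *list, size_t n)
{
	int expired = 0;
	size_t i;

	assert(list || n == 0);
	for (i = 0; i < n; i++) {
		struct toy_mount *m = &list[i];

		if (m->expired)
			continue;
		if (!m->expiry_mark) {	/* first sighting: only mark it */
			m->expiry_mark = 1;
			continue;
		}
		if (m->busy)		/* marked, but still in use */
			continue;
		m->expired = 1;		/* marked on an earlier pass and idle */
		expired++;
	}
	return expired;
}
```

In this model a mount needs two consecutive passes without any use in between before it goes away, which is why the pass would typically be driven from a periodic timer or work item.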
......
......@@ -483,3 +483,20 @@ in your dentry operations instead.
--
[mandatory]
->aio_read/->aio_write are gone. Use ->read_iter/->write_iter.
+---
+[recommended]
+for embedded ("fast") symlinks just set inode->i_link to wherever the
+symlink body is and use simple_follow_link() as ->follow_link().
+--
+[mandatory]
+calling conventions for ->follow_link() have changed. Instead of returning
+cookie and using nd_set_link() to store the body to traverse, we return
+the body to traverse and store the cookie using explicit void ** argument.
+nameidata isn't passed at all - nd_jump_link() doesn't need it and
+nd_[gs]et_link() is gone.
+--
+[mandatory]
+calling conventions for ->put_link() have changed. It gets inode instead of
+dentry, it does not get nameidata at all and it gets called only when cookie
+is non-NULL. Note that link body isn't available anymore, so if you need it,
+store it as cookie.
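
The two [mandatory] notes above can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct inode`, `struct dentry`, every `demo_*` function and the `put_link_calls` counter are hypothetical stand-ins, and only the argument shapes follow the new prototypes. Allocation failure is reported with NULL here, where kernel code would return an ERR_PTR() value.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct inode { const char *i_link; };
struct dentry { struct inode *d_inode; };

int put_link_calls;	/* how many times demo_put_link() ran */

/* Body lives as long as the inode: return it and leave *cookie alone. */
const char *demo_fast_follow_link(struct dentry *dentry, void **cookie)
{
	(void)cookie;
	return dentry->d_inode->i_link;
}

/* Body is a temporary copy: return it and store what must be freed. */
const char *demo_slow_follow_link(struct dentry *dentry, void **cookie)
{
	const char *src = dentry->d_inode->i_link;
	char *copy = malloc(strlen(src) + 1);

	if (!copy)
		return NULL;	/* kernel code would return ERR_PTR(-ENOMEM) */
	strcpy(copy, src);
	*cookie = copy;
	return copy;
}

/* Gets the inode (unused here) and the cookie; no dentry, no nameidata. */
void demo_put_link(struct inode *unused, void *cookie)
{
	(void)unused;
	free(cookie);
	put_link_calls++;
}

typedef const char *(*follow_fn)(struct dentry *, void **);

/* Stand-in for the caller: put_link runs only for a non-NULL cookie. */
int demo_walk(struct dentry *dentry, follow_fn follow, char *out, size_t outlen)
{
	void *cookie = NULL;
	const char *body;

	assert(outlen > 0);
	body = follow(dentry, &cookie);
	if (!body)
		return -1;
	strncpy(out, body, outlen - 1);
	out[outlen - 1] = '\0';
	if (cookie)
		demo_put_link(dentry->d_inode, cookie);
	return 0;
}
```

The fast variant never touches the cookie, so the release hook is skipped for it; the slow variant stores the pointer it must free, which is the only thing the release hook receives.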
......@@ -350,8 +350,8 @@ struct inode_operations {
int (*rename2) (struct inode *, struct dentry *,
struct inode *, struct dentry *, unsigned int);
int (*readlink) (struct dentry *, char __user *,int);
-void * (*follow_link) (struct dentry *, struct nameidata *);
-void (*put_link) (struct dentry *, struct nameidata *, void *);
+const char *(*follow_link) (struct dentry *, void **);
+void (*put_link) (struct inode *, void *);
int (*permission) (struct inode *, int);
int (*get_acl)(struct inode *, int);
int (*setattr) (struct dentry *, struct iattr *);
......@@ -436,16 +436,18 @@ otherwise noted.
follow_link: called by the VFS to follow a symbolic link to the
inode it points to. Only required if you want to support
-symbolic links. This method returns a void pointer cookie
-that is passed to put_link().
+symbolic links. This method returns the symlink body
+to traverse (and possibly resets the current position with
+nd_jump_link()). If the body won't go away until the inode
+is gone, nothing else is needed; if it needs to be otherwise
+pinned, the data needed to release whatever we'd grabbed
+is to be stored in void * variable passed by address to
+follow_link() instance.
put_link: called by the VFS to release resources allocated by
-follow_link(). The cookie returned by follow_link() is passed
-to this method as the last parameter. It is used by
-filesystems such as NFS where page cache is not stable
-(i.e. page that was installed when the symbolic link walk
-started might not be in the page cache at the end of the
-walk).
+follow_link(). The cookie stored by follow_link() is passed
+to this method as the last parameter; only called when
+cookie isn't NULL.
permission: called by the VFS to check for access rights on a POSIX-like
filesystem.
......
......@@ -189,22 +189,7 @@ static inline int ll_quota_off(struct super_block *sb, int off, int remount)
#endif
-/*
-* After 3.1, kernel's nameidata.intent.open.flags is different
-* with lustre's lookup_intent.it_flags, as lustre's it_flags'
-* lower bits equal to FMODE_xxx while kernel doesn't transliterate
-* lower bits of nameidata.intent.open.flags to FMODE_xxx.
-* */
-#include <linux/version.h>
-static inline int ll_namei_to_lookup_intent_flag(int flag)
-{
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 1, 0)
-flag = (flag & ~O_ACCMODE) | OPEN_FMODE(flag);
-#endif
-return flag;
-}
#include <linux/fs.h>
# define ll_umode_t umode_t
......
......@@ -57,12 +57,6 @@
#define VM_FAULT_RETRY 0
#endif
-/* Kernel 3.1 kills LOOKUP_CONTINUE, LOOKUP_PARENT is equivalent to it.
-* seem kernel commit 49084c3bb2055c401f3493c13edae14d49128ca0 */
-#ifndef LOOKUP_CONTINUE
-#define LOOKUP_CONTINUE LOOKUP_PARENT
-#endif
/** Only used on client-side for indicating the tail of dir hash/offset. */
#define LL_DIR_END_OFF 0x7fffffffffffffffULL
#define LL_DIR_END_OFF_32BIT 0x7fffffffUL
......
......@@ -118,7 +118,7 @@ failed:
return rc;
}
-static void *ll_follow_link(struct dentry *dentry, struct nameidata *nd)
+static const char *ll_follow_link(struct dentry *dentry, void **cookie)
{
struct inode *inode = d_inode(dentry);
struct ptlrpc_request *request = NULL;
......@@ -126,32 +126,22 @@ static void *ll_follow_link(struct dentry *dentry, struct nameidata *nd)
char *symname = NULL;
CDEBUG(D_VFSTRACE, "VFS Op\n");
-/* Limit the recursive symlink depth to 5 instead of default
-* 8 links when kernel has 4k stack to prevent stack overflow.
-* For 8k stacks we need to limit it to 7 for local servers. */
-if (THREAD_SIZE < 8192 && current->link_count >= 6) {
-rc = -ELOOP;
-} else if (THREAD_SIZE == 8192 && current->link_count >= 8) {
-rc = -ELOOP;
-} else {
-ll_inode_size_lock(inode);
-rc = ll_readlink_internal(inode, &request, &symname);
-ll_inode_size_unlock(inode);
-}
+ll_inode_size_lock(inode);
+rc = ll_readlink_internal(inode, &request, &symname);
+ll_inode_size_unlock(inode);
if (rc) {
ptlrpc_req_finished(request);
-request = NULL;
-symname = ERR_PTR(rc);
+return ERR_PTR(rc);
}
-nd_set_link(nd, symname);
/* symname may contain a pointer to the request message buffer,
* we delay request releasing until ll_put_link then.
*/
-return request;
+*cookie = request;
+return symname;
}
-static void ll_put_link(struct dentry *dentry, struct nameidata *nd, void *cookie)
+static void ll_put_link(struct inode *unused, void *cookie)
{
ptlrpc_req_finished(cookie);
}
......
......@@ -149,8 +149,6 @@ extern int v9fs_vfs_unlink(struct inode *i, struct dentry *d);
extern int v9fs_vfs_rmdir(struct inode *i, struct dentry *d);
extern int v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
struct inode *new_dir, struct dentry *new_dentry);
-extern void v9fs_vfs_put_link(struct dentry *dentry, struct nameidata *nd,
-void *p);
extern struct inode *v9fs_inode_from_fid(struct v9fs_session_info *v9ses,
struct p9_fid *fid,
struct super_block *sb, int new);
......
......@@ -1224,100 +1224,43 @@ ino_t v9fs_qid2ino(struct p9_qid *qid)
}
/**
-* v9fs_readlink - read a symlink's location (internal version)
+* v9fs_vfs_follow_link - follow a symlink path
* @dentry: dentry for symlink
-* @buffer: buffer to load symlink location into
-* @buflen: length of buffer
-*
+* @cookie: place to pass the data to put_link()
*/
-static int v9fs_readlink(struct dentry *dentry, char *buffer, int buflen)
+static const char *v9fs_vfs_follow_link(struct dentry *dentry, void **cookie)
{
-int retval;
-struct v9fs_session_info *v9ses;
-struct p9_fid *fid;
+struct v9fs_session_info *v9ses = v9fs_dentry2v9ses(dentry);
+struct p9_fid *fid = v9fs_fid_lookup(dentry);
struct p9_wstat *st;
+char *res;
+p9_debug(P9_DEBUG_VFS, "%pd\n", dentry);
-p9_debug(P9_DEBUG_VFS, " %pd\n", dentry);
-retval = -EPERM;
-v9ses = v9fs_dentry2v9ses(dentry);
-fid = v9fs_fid_lookup(dentry);
if (IS_ERR(fid))
-return PTR_ERR(fid);
+return ERR_CAST(fid);
if (!v9fs_proto_dotu(v9ses))
-return -EBADF;
+return ERR_PTR(-EBADF);
st = p9_client_stat(fid);
if (IS_ERR(st))
-return PTR_ERR(st);
+return ERR_CAST(st);
if (!(st->mode & P9_DMSYMLINK)) {
-retval = -EINVAL;
-goto done;
+p9stat_free(st);
+kfree(st);
+return ERR_PTR(-EINVAL);
}
+res = st->extension;
+st->extension = NULL;
+if (strlen(res) >= PATH_MAX)
+res[PATH_MAX - 1] = '\0';
-/* copy extension buffer into buffer */
-retval = min(strlen(st->extension)+1, (size_t)buflen);
-memcpy(buffer, st->extension, retval);
-p9_debug(P9_DEBUG_VFS, "%pd -> %s (%.*s)\n",
-dentry, st->extension, buflen, buffer);
-done:
p9stat_free(st);
kfree(st);
-return retval;
-}
-/**
-* v9fs_vfs_follow_link - follow a symlink path
-* @dentry: dentry for symlink
-* @nd: nameidata
-*
-*/
-static void *v9fs_vfs_follow_link(struct dentry *dentry, struct nameidata *nd)
-{
-int len = 0;
-char *link = __getname();
-p9_debug(P9_DEBUG_VFS, "%pd\n", dentry);
-if (!link)
-link = ERR_PTR(-ENOMEM);
-else {
-len = v9fs_readlink(dentry, link, PATH_MAX);
-if (len < 0) {
-__putname(link);
-link = ERR_PTR(len);
-} else
-link[min(len, PATH_MAX-1)] = 0;
-}
-nd_set_link(nd, link);
-return NULL;
-}
-/**
-* v9fs_vfs_put_link - release a symlink path
-* @dentry: dentry for symlink
-* @nd: nameidata
-* @p: unused
-*
-*/
-void
-v9fs_vfs_put_link(struct dentry *dentry, struct nameidata *nd, void *p)
-{
-char *s = nd_get_link(nd);
-p9_debug(P9_DEBUG_VFS, " %pd %s\n",
-dentry, IS_ERR(s) ? "<error>" : s);
-if (!IS_ERR(s))
-__putname(s);
+return *cookie = res;
}
/**
......@@ -1370,6 +1313,8 @@ v9fs_vfs_symlink(struct inode *dir, struct dentry *dentry, const char *symname)
return v9fs_vfs_mkspecial(dir, dentry, P9_DMSYMLINK, symname);
}
+#define U32_MAX_DIGITS 10
/**
* v9fs_vfs_link - create a hardlink
* @old_dentry: dentry for file to link to
......@@ -1383,7 +1328,7 @@ v9fs_vfs_link(struct dentry *old_dentry, struct inode *dir,
struct dentry *dentry)
{
int retval;
-char *name;
+char name[1 + U32_MAX_DIGITS + 2]; /* sign + number + \n + \0 */
struct p9_fid *oldfid;
p9_debug(P9_DEBUG_VFS, " %lu,%pd,%pd\n",
......@@ -1393,20 +1338,12 @@ v9fs_vfs_link(struct dentry *old_dentry, struct inode *dir,
if (IS_ERR(oldfid))
return PTR_ERR(oldfid);
-name = __getname();
-if (unlikely(!name)) {
-retval = -ENOMEM;
-goto clunk_fid;
-}
sprintf(name, "%d\n", oldfid->fid);
retval = v9fs_vfs_mkspecial(dir, dentry, P9_DMLINK, name);
-__putname(name);
if (!retval) {
v9fs_refresh_inode(oldfid, d_inode(old_dentry));
v9fs_invalidate_inode_attr(dir);
}
-clunk_fid:
p9_client_clunk(oldfid);
return retval;
}
......@@ -1425,7 +1362,7 @@ v9fs_vfs_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t rde
{
struct v9fs_session_info *v9ses = v9fs_inode2v9ses(dir);
int retval;
-char *name;
+char name[2 + U32_MAX_DIGITS + 1 + U32_MAX_DIGITS + 1];
u32 perm;
p9_debug(P9_DEBUG_VFS, " %lu,%pd mode: %hx MAJOR: %u MINOR: %u\n",
......@@ -1435,26 +1372,16 @@ v9fs_vfs_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t rde
if (!new_valid_dev(rdev))
return -EINVAL;
-name = __getname();
-if (!name)
-return -ENOMEM;
/* build extension */
if (S_ISBLK(mode))
sprintf(name, "b %u %u", MAJOR(rdev), MINOR(rdev));
else if (S_ISCHR(mode))
sprintf(name, "c %u %u", MAJOR(rdev), MINOR(rdev));
-else if (S_ISFIFO(mode))
-*name = 0;
-else if (S_ISSOCK(mode))
+else
*name = 0;
-else {
-__putname(name);
-return -EINVAL;
-}
perm = unixmode2p9mode(v9ses, mode);
retval = v9fs_vfs_mkspecial(dir, dentry, perm, name);
-__putname(name);
return retval;
}
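
The sizes of the two on-stack `name` buffers introduced above can be checked with a small userspace program. This is a sketch under stated assumptions: `link_name_bytes()`, `dev_name_bytes()` and `check_name_buffers()` are hypothetical helpers, `int` and `unsigned int` are assumed to be 32 bits wide, and the worst case is taken over the full range of those types rather than over the values the kernel can actually produce. The device format uses `%c` for the type letter, where the original code has separate "b ..." and "c ..." format strings.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define U32_MAX_DIGITS 10	/* decimal digits in 4294967295 */

/* Bytes needed, including the trailing NUL, for the "%d\n" link name. */
int link_name_bytes(int fid)
{
	return snprintf(NULL, 0, "%d\n", fid) + 1;
}

/* Bytes needed, including the trailing NUL, for "b %u %u" or "c %u %u". */
int dev_name_bytes(char type, unsigned int major, unsigned int minor)
{
	return snprintf(NULL, 0, "%c %u %u", type, major, minor) + 1;
}

/* Compare the worst cases against the array sizes used in the diff. */
void check_name_buffers(void)
{
	/* sign + number + \n + \0 */
	assert(link_name_bytes(INT_MIN) <= 1 + U32_MAX_DIGITS + 2);
	/* "b " + major + space + minor + \0 */
	assert(dev_name_bytes('b', UINT_MAX, UINT_MAX)
	       <= 2 + U32_MAX_DIGITS + 1 + U32_MAX_DIGITS + 1);
}
```

Calling `check_name_buffers()` from a test harness aborts if either format could overflow its buffer under the stated assumptions.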
......@@ -1530,7 +1457,7 @@ static const struct inode_operations v9fs_file_inode_operations = {
static const struct inode_operations v9fs_symlink_inode_operations = {
.readlink = generic_readlink,
.follow_link = v9fs_vfs_follow_link,
-.put_link = v9fs_vfs_put_link,
+.put_link = kfree_put_link,
.getattr = v9fs_vfs_getattr,
.setattr = v9fs_vfs_setattr,
};
......
......@@ -905,41 +905,24 @@ error:
/**
* v9fs_vfs_follow_link_dotl - follow a symlink path
* @dentry: dentry for symlink
-* @nd: nameidata
-*
+* @cookie: place to pass the data to put_link()
*/
-static void *
-v9fs_vfs_follow_link_dotl(struct dentry *dentry, struct nameidata *nd)
+static const char *
+v9fs_vfs_follow_link_dotl(struct dentry *dentry, void **cookie)
{
-int retval;
-struct p9_fid *fid;
-char *link = __getname();
+struct p9_fid *fid = v9fs_fid_lookup(dentry);
char *target;
+int retval;
p9_debug(P9_DEBUG_VFS, "%pd\n", dentry);
-if (!link) {
-link = ERR_PTR(-ENOMEM);
-goto ndset;
-}
-fid = v9fs_fid_lookup(dentry);
-if (IS_ERR(fid)) {
-__putname(link);
-link = ERR_CAST(fid);
-goto ndset;
-}
+if (IS_ERR(fid))
+return ERR_CAST(fid);
retval = p9_client_readlink(fid, &target);
-if (!retval) {
-strcpy(link, target);
-kfree(target);
-goto ndset;
-}
-__putname(link);
-link = ERR_PTR(retval);
-ndset:
-nd_set_link(nd, link);
-return NULL;
+if (retval)
+return ERR_PTR(retval);
+return *cookie = target;
}
int v9fs_refresh_inode_dotl(struct p9_fid *fid, struct inode *inode)
......@@ -1006,7 +989,7 @@ const struct inode_operations v9fs_file_inode_operations_dotl = {
const struct inode_operations v9fs_symlink_inode_operations_dotl = {
.readlink = generic_readlink,
.follow_link = v9fs_vfs_follow_link_dotl,
-.put_link = v9fs_vfs_put_link,
+.put_link = kfree_put_link,
.getattr = v9fs_vfs_getattr_dotl,
.setattr = v9fs_vfs_setattr_dotl,
.setxattr = generic_setxattr,
......
......@@ -12,14 +12,13 @@
#include "autofs_i.h"
-static void *autofs4_follow_link(struct dentry *dentry, struct nameidata *nd)
+static const char *autofs4_follow_link(struct dentry *dentry, void **cookie)
{
struct autofs_sb_info *sbi = autofs4_sbi(dentry->d_sb);
struct autofs_info *ino = autofs4_dentry_ino(dentry);
if (ino && !autofs4_oz_mode(sbi))
ino->last_used = jiffies;
-nd_set_link(nd, d_inode(dentry)->i_private);
-return NULL;
+return d_inode(dentry)->i_private;
}
const struct inode_operations autofs4_symlink_inode_operations = {
......
......@@ -42,8 +42,7 @@ static struct inode *befs_iget(struct super_block *, unsigned long);
static struct inode *befs_alloc_inode(struct super_block *sb);
static void befs_destroy_inode(struct inode *inode);
static void befs_destroy_inodecache(void);
-static void *befs_follow_link(struct dentry *, struct nameidata *);
-static void *befs_fast_follow_link(struct dentry *, struct nameidata *);
+static const char *befs_follow_link(struct dentry *, void **);
static int befs_utf2nls(struct super_block *sb, const char *in, int in_len,
char **out, int *out_len);
static int befs_nls2utf(struct super_block *sb, const char *in, int in_len,
......@@ -80,11 +79,6 @@ static const struct address_space_operations befs_aops = {
.bmap = befs_bmap,
};
-static const struct inode_operations befs_fast_symlink_inode_operations = {
-.readlink = generic_readlink,
-.follow_link = befs_fast_follow_link,
-};
static const struct inode_operations befs_symlink_inode_operations = {
.readlink = generic_readlink,
.follow_link = befs_follow_link,
......@@ -403,10 +397,12 @@ static struct inode *befs_iget(struct super_block *sb, unsigned long ino)
inode->i_op = &befs_dir_inode_operations;
inode->i_fop = &befs_dir_operations;
} else if (S_ISLNK(inode->i_mode)) {
-if (befs_ino->i_flags & BEFS_LONG_SYMLINK)
+if (befs_ino->i_flags & BEFS_LONG_SYMLINK) {
inode->i_op = &befs_symlink_inode_operations;
-else
-inode->i_op = &befs_fast_symlink_inode_operations;
+} else {
+inode->i_link = befs_ino->i_data.symlink;
+inode->i_op = &simple_symlink_inode_operations;
+}
} else {
befs_error(sb, "Inode %lu is not a regular file, "
"directory or symlink. THAT IS WRONG! BeFS has no "
......@@ -467,8 +463,8 @@ befs_destroy_inodecache(void)
* The data stream become link name. Unless the LONG_SYMLINK
* flag is set.
*/
-static void *
-befs_follow_link(struct dentry *dentry, struct nameidata *nd)
+static const char *
+befs_follow_link(struct dentry *dentry, void **cookie)
{
struct super_block *sb = dentry->d_sb;
struct befs_inode_info *befs_ino = BEFS_I(d_inode(dentry));
......@@ -478,33 +474,20 @@ befs_follow_link(struct dentry *dentry, struct nameidata *nd)
if (len == 0) {
befs_error(sb, "Long symlink with illegal length");
-link = ERR_PTR(-EIO);
-} else {
-befs_debug(sb, "Follow long symlink");
-link = kmalloc(len, GFP_NOFS);
-if (!link) {
-link = ERR_PTR(-ENOMEM);
-} else if (befs_read_lsymlink(sb, data, link, len) != len) {
-kfree(link);
-befs_error(sb, "Failed to read entire long symlink");
-link = ERR_PTR(-EIO);
-} else {
-link[len - 1] = '\0';
-}
+return ERR_PTR(-EIO);
}
-nd_set_link(nd, link);
-return NULL;
-}
-static void *
-befs_fast_follow_link(struct dentry *dentry, struct nameidata *nd)
-{
-struct befs_inode_info *befs_ino = BEFS_I(d_inode(dentry));
+befs_debug(sb, "Follow long symlink");
-nd_set_link(nd, befs_ino->i_data.symlink);
-return NULL;
+link = kmalloc(len, GFP_NOFS);
+if (!link)
+return ERR_PTR(-ENOMEM);
+if (befs_read_lsymlink(sb, data, link, len) != len) {
+kfree(link);
+befs_error(sb, "Failed to read entire long symlink");
+return ERR_PTR(-EIO);
+}
+link[len - 1] = '\0';